ABSTRACT


The CHIME/FRB Project has recently released its first catalog of fast radio bursts (FRBs),
containing 492 unique sources. We present results from angular cross-correlations of CHIME/FRB
sources with galaxy catalogs. We find a statistically significant (p-value $\sim 10^{-4}$,
accounting for look-elsewhere factors) cross-correlation between CHIME FRBs and galaxies in the
redshift range $0.3 \lesssim z \lesssim 0.5$, in three photometric galaxy surveys: WISExSCOS,
DESI-BGS, and DESI-LRG. The level of cross-correlation is consistent with an order-one fraction
of the CHIME FRBs being in the same dark matter halos as survey galaxies in this redshift range.
We find statistical evidence for a population of FRBs with large host dispersion measure
($\sim 400$ pc cm$^{-3}$), and show that this can plausibly arise from gas in large halos
($M \sim 10^{14} M_\odot$), for FRBs near the halo center ($r \lesssim 100$ kpc). These results
will improve in future CHIME/FRB catalogs, with more FRBs and better angular resolution.

Corresponding author: Masoud Rafiei-Ravandi
mrafieiravandi@perimeterinstitute.ca


Keywords: Radio transient sources (2008), Large-scale structure of the universe (902),
High energy astrophysics (739), Cosmology (343)


1. INTRODUCTION 


Fast radio bursts (FRBs) are millisecond flashes of ra- 
dio waves whose dispersion is beyond what we expect 
from Galactic models along the line of sight. The ori- 
gin of FRBs is still a mystery, despite over a decade 
of observations and theoretical exploration (see, e.g. 
Cordes & Chatterjee 2019; Petroff et al. 2019; Platts 
et al. 2019). The Canadian Hydrogen Intensity Mapping 
Experiment / Fast Radio Burst Project (CHIME/FRB; 
CHIME/FRB Collaboration 2018) has recently released 
its first catalog of FRBs containing 492 unique sources 
(CHIME/FRB Collaboration 2021), increasing the number
of known FRBs by a factor of ~4.$^1$ This unprecedented
sample size is a new opportunity for statistical studies 
of FRBs. 

The angular resolution of CHIME/FRB is not suffi- 
cient to associate FRBs with unique host galaxies, ex- 
cept for some FRBs at very low DM, for example a re- 
peating CHIME FRB associated with M81 (Bhardwaj 
et al. 2021). This appears to put some science questions 
out of reach, such as determining the redshift distribu- 
tion of CHIME FRBs. 

However, with large enough catalogs of both FRBs 
and galaxies, it is possible to associate FRBs with galaxies
statistically, using angular cross-correlations. Intuitively,
if the angular resolution $\theta_f$ of an FRB experiment
is too large for unique host galaxy associations,
there will still be an excess probability (relative to a random
point on the sky) to observe FRBs within distance
$\sim \theta_f$ of a galaxy. Formally, this corresponds to a
cross-correlation between the FRB and galaxy catalogs, which
we will define precisely in §3. By measuring the correlation
as a function of galaxy redshift and FRB dispersion
measure (DM) (defined below), the redshift distribution 
and related properties of the FRB population can be 
constrained, even in the absence of per-object associa- 
tions. 

FRB-galaxy cross-correlations have been proposed in
a forecasting context (McQuinn 2014; Masui & Sigurdson 2015;
Shirasaki et al. 2017; Madhavacheril et al. 2019;
Rafiei-Ravandi et al. 2020; Reischke et al. 2021a;
Alonso 2021; Reischke et al. 2021b), and applied to
the ASKAP and 2MPZ/HIPASS catalogs by Li et al.
(2019). In this paper, we will use machinery developed
by Rafiei-Ravandi et al. (2020) for modeling the
FRB-galaxy cross-correlation, and disentangling it from
propagation effects. This machinery uses the halo model
for cosmological large-scale structure (LSS); for a review
see Cooray & Sheth (2002).

$^1$ For a complete list of known FRBs, see https://www.herta-experiment.org/frbstats
(Spanakis-Misirlis 2021) or the Transient Name Server (TNS, Petroff & Yaron 2020).

Before summarizing the main results presented here, 
we recall the definition of FRB DM. FRBs are dispersed: 
the arrival time at radio frequency $\nu$ is delayed, by an
amount proportional to $\nu^{-2}$. The dispersion is proportional
to the DM, defined as the free electron column
density along the line of sight: 


$$ \mathrm{DM} \equiv \int n_e(x)\, dx . \tag{1} $$


Since FRBs have not been observed to have spectral 
lines, FRB redshifts are not directly observable. How- 
ever, the DM is a rough proxy for redshift (Macquart 
et al. 2020). We write the total DM as the sum of contributions
from our Galaxy and halo (DM$_{\rm gal}$), the IGM
(DM$_{\rm IGM}$), and the FRB host galaxy and halo (DM$_{\rm host}$):


$$ \mathrm{DM} = \mathrm{DM}_{\rm gal} + \mathrm{DM}_{\rm IGM}(z) + \mathrm{DM}_{\rm host} . \tag{2} $$


The IGM contribution DM$_{\rm IGM}(z)$ is given by the Macquart relation:

$$ \mathrm{DM}_{\rm IGM}(z) = n_{e,0} \int_0^z dz' \, f_d(z') \, \frac{1+z'}{H(z')} , \tag{3} $$

where $f_d(z)$ is the mean electron ionization fraction at
redshift $z$, $n_{e,0} = 2.13 \times 10^{-7}$ cm$^{-3}$ is the comoving
electron density, and $H(z)$ is the Hubble expansion rate. If
$f_d$ is assumed independent of redshift, then Eq. (3) has
the following useful approximation:

$$ \mathrm{DM}_{\rm IGM}(z) \approx (1000 \ \mathrm{pc\,cm^{-3}}) \, f_d \, z . \tag{4} $$

We checked that this approximation is accurate to 6%
for $z < 3$, assuming that helium reionization is complete
by $z = 3$. By default, we assume $f_d = 0.9$, which implies
DM$_{\rm IGM}(z) \approx 900\,z$ pc cm$^{-3}$.
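As a rough numerical illustration of Eqs. (3) and (4), the sketch below integrates the Macquart relation for constant $f_d$ and compares it with the linear approximation. It is not the authors' code. It assumes that a factor of the speed of light is restored in Eq. (3), neglects radiation, and uses the flat ΛCDM parameters adopted in this paper (h = 0.67, Ω_m = 0.315). All function and constant names are hypothetical.

```python
import math

# Assumed constants (cosmology values are those adopted in this paper).
C_KM_S = 299792.458      # speed of light [km/s]
H0 = 67.0                # Hubble constant [km/s/Mpc], i.e. h = 0.67
OMEGA_M = 0.315          # matter abundance; flat LCDM, radiation neglected
NE0 = 2.13e-7            # comoving electron density [cm^-3]
PC_PER_MPC = 1.0e6


def dm_igm(z, f_d=0.9, n_steps=2000):
    """Eq. (3) with constant f_d, in pc cm^-3 (trapezoidal rule)."""
    def integrand(zp):
        e_z = math.sqrt(OMEGA_M * (1 + zp) ** 3 + 1 - OMEGA_M)  # H(z)/H0
        return (1 + zp) / e_z

    dz = z / n_steps
    total = 0.5 * (integrand(0.0) + integrand(z))
    total += sum(integrand(i * dz) for i in range(1, n_steps))
    hubble_distance_pc = C_KM_S / H0 * PC_PER_MPC  # c/H0 in pc
    return NE0 * hubble_distance_pc * f_d * total * dz


def dm_igm_approx(z, f_d=0.9):
    """Eq. (4): linear approximation, in pc cm^-3."""
    return 1000.0 * f_d * z


for z in (0.1, 0.5, 1.0, 2.0, 3.0):
    exact, approx = dm_igm(z), dm_igm_approx(z)
    print(f"z={z:.1f}  Eq.(3)={exact:7.1f}  Eq.(4)={approx:7.1f}  "
          f"ratio={exact / approx:.3f}")
```

Under these assumptions the two expressions stay within several percent of each other out to z = 3, in line with the accuracy quoted above.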

We briefly summarize the main results of the paper.
We find a statistically significant correlation between
CHIME FRBs and galaxies in the redshift range
$0.3 \lesssim z \lesssim 0.5$. The correlation is seen in three
photometric galaxy surveys: WISExSCOS, DESI-BGS, and
DESI-LRG (described in §2.2). The statistical significance
of the detection in each survey is $p \sim 2.7 \times 10^{-5}$,
$3.1 \times 10^{-4}$, and $4.1 \times 10^{-4}$, respectively.
These p-values account for look-elsewhere effects, in both angular
scale and redshift range. The observed level of correla- 
tion is consistent with an order-one fraction of CHIME 
FRBs inhabiting the same dark matter halos as galaxies 
in these surveys. CHIME/FRB does not resolve halos, 
so we cannot distinguish between FRBs in survey galax- 
ies and FRBs in the same halos as survey galaxies. 

We study the DM dependence of the FRB-galaxy correlation
and find a correlation between high-DM (extragalactic
DM > 785 pc cm$^{-3}$) FRBs and galaxies at $z \sim 0.4$.
This implies the existence of an FRB subpopulation
with host DM $\gtrsim 400$ pc cm$^{-3}$. Such large host DMs have
not yet been seen in observations that directly associate
FRBs with host galaxies. To date, 14 FRBs (excluding
a Galactic magnetar; see CHIME/FRB Collaboration
2020a; Bochenek et al. 2020) have been localized to
host galaxies, all of which have DM$_{\rm host} \lesssim 200$ pc cm$^{-3}$.
In §4.2, we explain why these observations are not in
conflict. We also show that host DMs $\gtrsim 400$ pc cm$^{-3}$
can arise from ionized gas in large ($M \gtrsim 10^{14} M_\odot$) dark
matter halos, if FRBs are located near the halo center
($r \lesssim 100$ kpc).

This paper is structured as follows. In §2, we describe
the observations and data reduction. Clustering results
are presented in §3 and interpreted in §4. We conclude
in §5. Throughout, we adopt a flat ΛCDM cosmology
with Hubble expansion rate $h = 0.67$, matter abundance
$\Omega_m = 0.315$, baryon abundance $\Omega_b = 0.048$, initial
power spectrum amplitude $A_s = 2.10 \times 10^{-9}$, spectral
index $n_s = 0.965$, neutrino mass $\sum_\nu m_\nu = 0.06$ eV, and
CMB temperature $T_{\rm CMB} = 2.726$ K. These parameters
are consistent with Planck results (Aghanim et al. 2020).


2. DATA 
2.1. FRB catalog 


The first CHIME/FRB catalog is described by
CHIME/FRB Collaboration (2021). In order to maximize
localization precision and to simplify selection
biases, we include only a single burst with the high- 
est significance for each repeating FRB in this anal- 
ysis. This treats repeating and nonrepeating FRBs 
as a single population. In future CHIME/FRB 
catalogs with more repeaters, it would be interest- 
ing to analyze the two populations separately. In 
CHIME/FRB, there is currently no evidence that 
repeaters and nonrepeaters have different sky dis- 
tributions (CHIME/FRB Collaboration 2021). We 


also exclude three sidelobe detections (FRB20190210D, 
FRB20190125B, FRB20190202B), leaving a sample of 
489 unique sources. We do not exclude FRBs with 
excluded_flag=1, indicating an epoch of low sensitiv- 
ity, since we expect the localization accuracy of such 
FRBs to be similar to the main catalog. 

Throughout this paper, all DM values are extragalactic.
That is, before further processing of the CHIME
FRBs, we subtract the Galactic contribution DM$_{\rm gal}$
from the observed DM. The value of DM$_{\rm gal}$ is estimated
using the YMW16 (Yao et al. 2017) model.
In §4.2, we show that using the NE2001 (Cordes & Lazio 
2002) model does not affect results qualitatively. The 
CHIME/FRB extragalactic DM distribution is shown 
in Figure 1. 

We do not subtract an estimate of the Milky Way 
halo DM, since the halo DM is currently poorly constrained
by observations. The range of allowed values
is roughly $10 \lesssim \mathrm{DM}_{\rm halo} \lesssim 100$ pc cm$^{-3}$, and
the (dipole-dominated) anisotropy is expected to be
small (Prochaska & Zheng 2019; Keating & Pen 2020).
The results of this paper are qualitatively unaffected by
the value of DM$_{\rm halo}$.

The CHIME/FRB pipeline assigns a nominal sky lo- 
cation to each FRB based on the observed signal-to- 
noise ratio (SNR) in each of 1024 formed beams. In 
the simplest case of an FRB that is detected only in 
a single formed beam, the nominal location is the cen- 
ter of the formed beam. For multibeam detections, the 
nominal location is roughly a weighted average of the 
beam centers (CHIME/FRB Collaboration 2019, 2021). 
Statistical errors on CHIME/FRB locations are difficult 
to model, since they depend on both the details of the 
CHIME telescope and selection biases that depend on 
the underlying FRB population. We discuss this further
in §3.1 and Appendix A.


2.2. Galaxy catalogs 


On the galaxy side, we have chosen five photomet- 
ric redshift catalogs: 2MPZ, WISExSCOS, DESI-BGS, 
DESI-LRG, and DESI-ELG. Note that the DESI cata- 
logs are the photometric target samples for forthcoming 
spectroscopic DESI surveys with the same names. Ta- 
ble 1 summarizes key properties of our reduced samples 
for the cross-correlation analysis, and the redshift dis- 
tributions are shown in Figure 1. 

The 2MASS Photometric Redshift (2MPZ) catalog 
(Bilicki et al. 2013) contains ~1 million galaxies with 
$z \lesssim 0.3$ (redshift error $\sigma_z \sim 0.02$), enabling the construction
of a 3D view of LSS at low redshifts (see, e.g.
Alonso et al. 2015; Balaguera-Antolínez et al. 2018). In 
this work, we use the mask made by Alonso et al. (2015)
for the 2MPZ catalog. Following Bilicki et al. (2013), we
discard galaxies whose $K_s$-band magnitude is below the
completeness limit $m_{K_s} = 13.9$.

Survey      f_sky   [z_min, z_max]  z_med  N_gal      N_FRB
2MPZ        0.647   [0.0, 0.3]      0.08   670,442    323
WISExSCOS   0.638   [0.0, 0.5]      0.16   6,931,441  310
DESI-BGS    0.118   [0.05, 0.4]     0.22   5,304,153  183
DESI-LRG    0.118   [0.3, 1.0]      0.69   2,331,043  183
DESI-ELG    0.055   [0.6, 1.4]      1.09   5,314,194  62
BGS+LRG     0.118   [0.05, 1.0]     0.28   7,690,819  183

Table 1. Galaxy survey parameters: sky fraction $f_{\rm sky}$
(not accounting for CHIME/FRB coverage), redshift range
$[z_{\min}, z_{\max}]$, median redshift $z_{\rm med}$, total number of unmasked
galaxies $N_{\rm gal}$, and number of FRBs $N_{\rm FRB}$ overlapping the
survey. The “BGS+LRG” catalog is used only in §3.5, and
consists of all unique objects from the DESI-BGS and DESI-LRG
catalogs.

The WISExSuperCOSMOS photometric redshift catalog
(WISExSCOS, Bilicki et al. 2016) contains
~20 million point sources with $z < 0.5$ ($\sigma_z \sim 0.03$)
over 70% of the sky, making it a versatile dataset for
cross-correlation studies. In this work, we use a slightly
modified catalog (Krakowski et al. 2016), which includes
probabilities ($p_{\rm gal}$, $p_{\rm star}$, $p_{\rm qso}$) for each object to be a
galaxy, star, or quasar, respectively. We use objects
with $p_{\rm gal} > 0.9$, which is consistent with the weighted
mean purity of identified galaxies across the W1 band
(Krakowski et al. 2016). We use a standard mask$^2$ to
remove the Galactic foreground, Magellanic Clouds and
bright stars. Additionally, we mask out regions that are
visually contaminated owing to their proximity to the
Galactic plane:


(|b| ≤ 20°) and ((0° < l < 30°) or (330° < l < 360°)),
(|b| ≤ 18°) and ((30° < l < 60°) or (300° < l < 330°)),
(|b| ≤ 17°) and (0° < l < 360°). (5)
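The additional mask in Eq. (5) can be expressed as a simple predicate on Galactic coordinates. The sketch below is illustrative only: the function name is hypothetical, and the treatment of boundary values (inclusive inequalities) is an assumption, since the strictness of the inequalities is not critical for the masked area.

```python
def in_galactic_plane_mask(l_deg, b_deg):
    """Return True if Galactic (l, b), in degrees, lies in the extra mask of Eq. (5).

    Boundary values are treated as masked (an assumption of this sketch).
    """
    lon = l_deg % 360.0
    abs_b = abs(b_deg)
    near_center = (lon <= 30.0) or (lon >= 330.0)
    mid_longitude = (30.0 <= lon <= 60.0) or (300.0 <= lon <= 330.0)
    return (
        (abs_b <= 20.0 and near_center)       # first line of Eq. (5)
        or (abs_b <= 18.0 and mid_longitude)  # second line of Eq. (5)
        or (abs_b <= 17.0)                    # third line: all longitudes
    )


# Example: a point at l = 10 deg, b = 19 deg is masked; l = 180 deg, b = 18 deg is not.
print(in_galactic_plane_mask(10.0, 19.0), in_galactic_plane_mask(180.0, 18.0))
```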


The Dark Energy Spectroscopic Instrument (DESI) 
Legacy Imaging Surveys (Dey et al. 2019) were designed 
to identify galaxies for spectroscopic follow-up. We use 
the catalogs from the DR8 release, with photometric 
redshifts from Zhou et al. (2020a). Following DESI, 
we consider three samples: the Bright Galaxy Survey 
(BGS), the Luminous Red Galaxy (LRG) sample, and 
the Emission Line Galaxy (ELG) sample, corresponding
to redshift ranges $0.05 < z < 0.4$, $0.3 < z < 1$, and
$0.6 < z < 1.4$ respectively (Figure 1).


$^2$ http://ssa.roe.ac.uk/WISExSCOS.html


For each of the three DESI samples, we define survey
geometry cuts as follows. For simplicity, we restrict
to the northern part of the survey (Dec > 32.375°, b >
+17°), which contains ~2 times as many CHIME/FRB
sources as the southern part. Note that the northern
and southern DESI surveys are obtained from different
telescopes and may have different systematics. For the
DESI-ELG sample, we impose the additional constraint
b > +45° in order to mitigate systematic depth variations.
We restrict to sky regions that were observed
at least twice in each of the {g, r, z} bands (Zhou et al.
2020a). We mask bad pixels, bright stars, large galaxies,
and globular clusters using the appropriate DESI
bitmask.$^3$

In addition to these geometric cuts, we impose per-object
cuts on the DESI catalogs by removing point-like
objects (TYPE=PSF), and applying the appropriate color
cuts for each of the three surveys. Color cuts for the
BGS, LRG, and ELG catalogs are defined by Ruiz-Macias
et al. (2020), Zhou et al. (2020b), and Raichoor
et al. (2020) respectively. For BGS, we include both
“faint” (19.5 < r < 20) and “bright” (r < 19.5) galaxies
(terminology from Ruiz-Macias et al. 2020). For
BGS and LRG, we exclude objects with poorly constrained
photometric redshifts ($z_{\rm phot,std} > 0.08$). Our
final BGS, LRG and ELG samples have typical redshift
error $\sigma_z \sim 0.03$, 0.04, and 0.15 respectively.


3. FRB-GALAXY CORRELATION RESULTS 


In this section, we describe our pipeline for comput- 
ing the FRB-galaxy cross power spectrum. The pipeline 
consists of mapping sources onto a sky grid and then 
computing the spherical harmonic transform and the 
angular power spectrum. Error bars are assigned using 
mock FRB catalogs. 


3.1. Pipeline overview 


Our central statistic is the angular power spectrum
$C_\ell^{fg}$, a Fourier-space statistic that measures the level of
correlation between the FRB catalog f and galaxy catalog g,
as a function of angular wavenumber $\ell$. Formally,
$C_\ell^{fg}$ is defined by

$$ \langle a_{\ell m}^{f*} \, a_{\ell' m'}^{g} \rangle = C_\ell^{fg} \, \delta_{\ell\ell'} \, \delta_{mm'} , \tag{6} $$

where $a_{\ell m}^Y$ is the spherical harmonic transform of catalog
$Y \in \{f, g\}$ (the all-sky analog of the Fourier transform
on the flat sky). Intuitively, a detection of nonzero $C_\ell^{fg}$
at wavenumber $\ell$ corresponds to a pixel-space angular
correlation at separation $\theta \sim \ell^{-1}$.

$^3$ MASKBITS 1, 5-9, and 11-13, defined here: https://www.legacysurvey.org/dr8/bitmasks/

Figure 1. Left panel: Redshift distributions for the five galaxy samples in this paper (§2.2).
Right panel: FRB extragalactic DM distributions for the CHIME/FRB catalog (solid) and for the
subset of the CHIME/FRB catalog that overlaps spatially with each galaxy survey.

The power spectrum $C_\ell^{fg}$ is not the only way of representing
a cross-correlation between catalogs as a function
of scale. Another possibility is the correlation
function $\zeta(\theta)$, obtained by counting pairs of objects
whose angular separation $\theta$ lies in a set of nonoverlapping
bins. This method was used by Li et al.
(2019) to correlate ASKAP FRBs with nearby galaxies.
The power spectrum $C_\ell^{fg}$ and correlation function
$\zeta(\theta)$ are related to each other by the Legendre transform
$\zeta(\theta) = \sum_\ell \frac{2\ell+1}{4\pi} C_\ell^{fg} P_\ell(\cos\theta)$. Therefore,
$C_\ell^{fg}$ and $\zeta(\theta)$ contain the same information, and the
choice of which one to use is a matter of convenience.
We have used the power spectrum $C_\ell^{fg}$ since it has the
property that nonoverlapping $\ell$-bins are nearly uncorrelated,
making it straightforward to infer statistical significance
from plots.
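The Legendre transform relating the power spectrum to the correlation function can be evaluated directly. The following is a minimal sketch using the standard three-term (Bonnet) recurrence for the Legendre polynomials; the function names are hypothetical and the input is assumed to be a list of $C_\ell$ values indexed by $\ell$ starting at $\ell = 0$.

```python
import math


def legendre_values(l_max, x):
    """Return [P_0(x), ..., P_lmax(x)] via the Bonnet recurrence."""
    p = [1.0, x]
    for ell in range(1, l_max):
        p.append(((2 * ell + 1) * x * p[ell] - ell * p[ell - 1]) / (ell + 1))
    return p[: l_max + 1]


def correlation_function(cl, theta_rad):
    """zeta(theta) = sum_l (2l+1)/(4 pi) C_l P_l(cos theta), with cl[l] indexed by l."""
    l_max = len(cl) - 1
    p = legendre_values(l_max, math.cos(theta_rad))
    return sum(
        (2 * ell + 1) / (4 * math.pi) * cl[ell] * p[ell] for ell in range(l_max + 1)
    )


# Example: a pure dipole spectrum C_1 = 4 pi / 3 gives zeta(theta) = cos(theta).
theta = 0.7
print(correlation_function([0.0, 4 * math.pi / 3], theta), math.cos(theta))
```

In practice the sum must be truncated at a finite maximum multipole, so the recovered correlation function is smoothed on scales below roughly $\pi/\ell_{\max}$.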

Throughout the paper, it will be useful to have a
model FRB-galaxy power spectrum $C_\ell^{fg}$ in mind. In Figure 2,
we show $C_\ell^{fg}$ for a galaxy population at $z \sim 0.4$,
calculated using the “high-z” FRB model from Rafiei-Ravandi
et al. (2020), with median FRB redshift $z = 0.76$.
The main features of $C_\ell^{fg}$ are as follows:


• The leftmost peak at $\ell \sim 10^2$ is the two-halo term
$C_\ell^{fg(2h)}$, which arises from FRBs and galaxies in
different halos. The two-halo term does not probe
the details of FRB-galaxy associations; it arises
because FRBs and galaxies both inhabit halos, and
halos are clustered on ~ 100 Mpc scales (the correlation
length of the cosmological density field).

• The rightmost peak at $\ell \sim 10^3$ is the one-halo term
$C_\ell^{fg(1h)}$, which is sourced by (FRB, galaxy) pairs
in the same dark matter halo.

• For completeness, we note that for $\ell \gtrsim 10^4$, there
is a “Poisson” term (not shown in Figure 2) that is
sourced by FRBs in catalog galaxies (not elsewhere
in the halo). CHIME/FRB’s limited angular resolution
suppresses $C_\ell^{fg}$ at high $\ell$, hiding the Poisson
term. Intuitively, this is because CHIME/FRB
cannot resolve different galaxies in the same dark
matter halo.

Figure 2. [Plot: one-halo term $C_\ell^{fg(1h)}$ and two-halo term $C_\ell^{fg(2h)}$ versus multipole
$\ell$ (bottom axis) and angular scale $\theta$ in arcmin (top axis).] Model FRB-galaxy power
spectrum $C_\ell^{fg}$ from §3.1, for a galaxy population near $z \sim 0.37$ and FRB angular
resolution 1'. Note that we have plotted $\ell C_\ell^{fg}$, for consistency with later plots in the
paper. In this and later plots in the paper, the angular scale on the top axis is
$\theta = \pi/\ell$, and is intended to provide an intuitive mapping between angular
multipole $\ell$ and an angular scale.


Although the one-halo and two-halo terms look com- 
parable in Figure 2, the SNR of the one-halo term is 
a few times larger. In this paper, we do not detect 
the two-halo term with statistical significance (see Figure 7).
Therefore, throughout the paper we will often
neglect the two-halo term, and make the approximation
$C_\ell^{fg} \approx C_\ell^{fg(1h)}$.

The one-halo term $C_\ell^{fg(1h)}$ is constant in $\ell$ for $\ell \lesssim 10^3$,
and suppressed for $\ell \gtrsim 10^3$. (Note that in Figure 2,
we have plotted $\ell C_\ell^{fg}$, for consistency with later figures
in the paper.) The high-$\ell$ suppression arises from
two effects: (1) statistical errors on FRB positions (the
CHIME/FRB “beam”), and (2) displacements between
FRBs and galaxies in the same dark matter halo.

Within the statistical errors of the $C_\ell^{fg}$ measurement
in this paper, both effects can be modeled as Gaussian,
i.e. high-$\ell$ suppression of the form $e^{-\ell^2/L^2}$.
We therefore model the power spectrum as

$$ C_\ell^{fg} = \alpha \, e^{-\ell^2/L^2} , \tag{7} $$

where we have omitted the two-halo term since we do
not detect it with statistical significance. In principle,
the value of $L$ in Eq. (7) is computable, given models
for statistical errors on CHIME FRB sky locations and
FRB/galaxy profiles within dark matter halos. However,
FRB halo profiles are currently poorly constrained,
and CHIME FRB location errors are difficult to model,
since they depend on both instrumental selection effects
and details of the FRB population. In Appendix A, we
explore modeling issues in detail and show that a plausible
(but conservative, i.e. wide) range of $L$-values is
$315 \le L \le 1396$.

Summarizing the above discussion, our pipeline works
as follows. We measure the angular power spectrum
$C_\ell^{fg}$ from the FRB and galaxy catalogs, and fit the
$\ell$-dependence to the template form $C_\ell^{fg} = \alpha e^{-\ell^2/L^2}$ in
Eq. (7). We treat the amplitude $\alpha$ as a free parameter,
and vary the template scale $L$ over the range
$315 \le L \le 1396$, to evaluate the correlation amplitude
as a function of scale.


3.2. Overdensity maps 


Turning now to implementation, the first step in our
pipeline is to convert the FRB and galaxy catalogs into
“overdensity” maps $\delta_f(x)$, $\delta_g(x)$, defined by

$$ \delta_Y(x) = \frac{1}{n_Y^{2d}\,\Omega_{\rm pix}} \left( N_{Y,x} - \bar{N}_{Y,x} \right) . \tag{8} $$

Here, $Y \in \{f, g\}$ denotes a catalog, $x$ denotes an angular
pixel, $N_{Y,x}$ denotes the number of catalog objects
in pixel $x$, and $\bar{N}_{Y,x}$ denotes the expected number of
catalog objects in pixel $x$ due to the survey geometry.
The prefactor $1/(n_Y^{2d}\Omega_{\rm pix})$ is conventional, where $n_Y^{2d}$ is
the 2D number density and $\Omega_{\rm pix}$ is the pixel area. For
CHIME/FRB, the expected number density $\bar{N}_{f,x}$ depends
on declination (Dec). The definition (8) of $\delta_f(x)$
weights each pixel $x$ proportionally to the expected number
of FRBs. This weighting is optimal since the FRB
field is Poisson noise dominated ($C_\ell^{ff} \approx 1/n_f^{2d}$).
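Eq. (8) amounts to a per-pixel subtraction and rescaling. The toy sketch below applies it to a list of pixel counts; it is a minimal illustration rather than the pipeline used in this paper, and the function name, the four-pixel "map", and the numerical values are all hypothetical.

```python
def overdensity_map(counts, expected, n2d, omega_pix):
    """Eq. (8): delta_Y(x) = (N_{Y,x} - Nbar_{Y,x}) / (n2d * omega_pix).

    counts    -- observed number of objects in each pixel, N_{Y,x}
    expected  -- expected number from the survey geometry, Nbar_{Y,x}
    n2d       -- 2D number density (objects per unit solid angle)
    omega_pix -- pixel area (same solid-angle unit as n2d)
    """
    norm = n2d * omega_pix
    return [(n - nbar) / norm for n, nbar in zip(counts, expected)]


# Toy example: four pixels with one expected object each.
counts = [2, 0, 1, 1]
expected = [1.0, 1.0, 1.0, 1.0]
delta = overdensity_map(counts, expected, n2d=4.0, omega_pix=0.25)
print(delta)
```

With this normalization, each object in a pixel contributes $1/(n_Y^{2d}\Omega_{\rm pix})$ to the map, and pixels with zero expected counts (masked pixels) contribute nothing when empty.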

The difference between a density map and an overdensity
map is the second term $\bar{N}$ in Eq. (8), which removes
spurious density fluctuations due to the survey geometry.
We compute the $\bar{N}$-term differently for different
catalogs as follows.

For the three DESI catalogs, we estimate $\bar{N}$ using
“randoms” from the DESI-DR8 release, i.e. simulated
catalogs that encode the survey geometry, with no spatial
correlations between objects. We use random catalogs
from the DESI-DR8 data release (source density
$n^{2d} = 5000$ deg$^{-2}$), and apply the DESI “geometry”
cuts from the previous section.

For the other two galaxy surveys (2MPZ and
WISExSCOS), random catalogs are not readily available,
so we represent the survey geometry by an angular
HEALPix (Gorski et al. 2005) mask, and assume uniform
galaxy density outside the mask:

$$ \bar{N}_{g,x} = \begin{cases} n_g^{2d}\,\Omega_{\rm pix} & \text{if } x \text{ is unmasked} \\ 0 & \text{if } x \text{ is masked} \end{cases} \tag{9} $$

The mask geometries for 2MPZ and WISExSCOS were
described previously in §2.2.

Finally, for the CHIME/FRB catalog, computing $\bar{N}$
deserves some discussion. The CHIME/FRB number
density $\bar{N}$ is inhomogeneous, peaking near the north celestial
pole. To an excellent approximation, the number
density is azimuthally symmetric in equatorial coordinates,
i.e. independent of right ascension (RA) at fixed
declination, because CHIME is a cylindrical drift-scan
telescope oriented north-south (CHIME/FRB Collaboration
2021). Therefore, we make random FRB catalogs
that represent $\bar{N}$ by randomizing RAs of the FRBs in
the observed catalog, leaving declinations fixed. When
making randoms, we also loop over 1000 copies of the
CHIME/FRB catalog, so that the random catalogs are
much larger than the data catalog (appropriately rescaling
$\bar{N}$ and $n_f^{2d}$ in Eq. 8).
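The RA-randomization step described above can be sketched as follows. This is an illustration under stated assumptions, not the CHIME/FRB pipeline: the function name, the seed handling, and the uniform draw over the full RA range are assumptions of the sketch.

```python
import random


def make_random_catalog(dec_deg, n_copies=1000, seed=0):
    """Build a random FRB catalog by keeping declinations and drawing uniform RAs.

    dec_deg  -- declinations of the observed FRBs [deg]
    n_copies -- number of copies of the observed catalog to loop over
    Returns (ra_list, dec_list), each of length n_copies * len(dec_deg).
    """
    rng = random.Random(seed)
    rand_ra, rand_dec = [], []
    for _ in range(n_copies):
        for dec in dec_deg:
            rand_ra.append(360.0 * rng.random())  # uniform in [0, 360) deg
            rand_dec.append(dec)                  # declination left unchanged
    return rand_ra, rand_dec


# Toy example with three hypothetical declinations.
ra, dec = make_random_catalog([10.0, 45.0, 80.0], n_copies=1000)
print(len(ra), min(ra) >= 0.0, max(ra) < 360.0)
```

Because the random catalog is `n_copies` times larger than the data, its pixel counts must be divided by `n_copies` before being used as $\bar{N}_{f,x}$ in Eq. (8).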

In Figure 3, we show overdensity maps $\delta_Y(x)$ for the
CHIME/FRB sources and the galaxies. These maps are
useful as visual checks for systematic effects, before catalogs
are cross-correlated. For example, if the Galactic
mask is not conservative enough, the overdensity map
may show visual artifacts with $\delta_g < 0$, since Galactic
extinction will suppress the observed catalog density $N_g$
relative to $\bar{N}_g$. No visual red flags are seen in either the
CHIME/FRB or galaxy maps, even without a Galactic
mask for CHIME/FRB. This is consistent with Josephy
et al. (2021), who found no evidence for Galactic
latitude dependence in the CHIME/FRB number density
after correcting for selection effects. As described
in §2.2, we do apply a Galactic mask in our pipeline, so
even if the FRB catalog does contain low-level biases in
the Galactic plane, they should be mitigated.


3.3. Estimating the power spectrum $C_\ell^{fg}$


We estimate $C_\ell^{fg}$ in our pipeline by taking spherical
harmonic transforms of the overdensity maps $\delta_f(x)$, $\delta_g(x)$, to get
spherical harmonic coefficients $a_{\ell m}^f$ and $a_{\ell m}^g$. Then, we
estimate the power spectrum $C_\ell^{fg}$ as

$$ \hat{C}_\ell^{fg} = \frac{1}{f_{\rm sky}^{fg}} \, \frac{1}{2\ell+1} \sum_{m=-\ell}^{\ell} a_{\ell m}^{f*} \, a_{\ell m}^{g} , \tag{10} $$

where $f_{\rm sky}^{fg}$ is the fractional sky area subtended by the
intersection of the FRB and galaxy surveys. The $1/f_{\rm sky}^{fg}$
prefactor normalizes the power spectrum estimator to have
the correct normalization on the partial sky. Throughout
the main analysis, we represent overdensities as
HEALPix maps with 1.7' resolution ($N_{\rm side} = 2048$),
and estimate the power spectrum to a maximum multipole
of $\ell_{\max} = 2000$, corresponding to angular scale
$\theta = \pi/\ell_{\max} = 5.4'$.
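The estimator in Eq. (10) can be sketched for coefficients that are already available. The layout assumed here (one list of $2\ell+1$ complex coefficients per multipole, ordered $m = -\ell, \ldots, \ell$) and the function name are choices made for this illustration only; a real analysis would obtain the coefficients from a HEALPix library rather than by hand.

```python
def cross_power(alm_f, alm_g, f_sky):
    """Eq. (10): C_l = (1/f_sky) * 1/(2l+1) * sum_m conj(a^f_lm) * a^g_lm.

    alm_f[l] and alm_g[l] hold the 2l+1 complex coefficients for m = -l..l
    (an assumed storage layout for this sketch).
    """
    cl = []
    for ell, (af, ag) in enumerate(zip(alm_f, alm_g)):
        total = sum(a.conjugate() * b for a, b in zip(af, ag))
        # For real-valued maps the sum over m is real; keep the real part.
        cl.append(complex(total).real / ((2 * ell + 1) * f_sky))
    return cl


# Toy example: identical coefficient sets for l = 0 and l = 1, half-sky overlap.
alm = [[1 + 0j], [1j, 2 + 0j, -1j]]
print(cross_power(alm, alm, f_sky=0.5))
```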

We assign error bars to the power spectrum $\hat{C}_\ell^{fg}$ using
Monte Carlo techniques, simulating mock FRB catalogs
and cross-correlating them with the real galaxy catalogs.
We simulate mock FRB catalogs by keeping FRB declinations
the same as in the real catalog, but randomizing
right ascensions. This mimics the logic used to construct
random FRB catalogs in §3.2. In fact, the only difference
in our pipeline between a “mock” and a “random”
FRB catalog is the number of FRBs: a mock catalog
has the same number of FRBs as the data, whereas a
random catalog has a much larger number. Conceptually,
there is another difference between mocks and
randoms: mocks should include any spatial clustering
signal present in the real data, whereas randoms are
unclustered and only represent the survey geometry. For
FRBs, spatial clustering is small compared to Poisson
noise ($C_\ell^{ff} \approx 1/n_f^{2d}$, see Figure 5), so we can make the
approximation that clustering is negligible.


3.4. Statistical significance and look-elsewhere effect 


In Figure 4, we show the angular power spectrum $\hat{C}_\ell^{fg}$
for a set of nonoverlapping $\ell$ bins. A weak positive FRB-galaxy
correlation is seen at $500 \lesssim \ell \lesssim 1000$ in some of
the galaxy surveys. In this subsection, we will address
the question of whether this correlation is statistically
significant.


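As explained in §3.1, we will fit the FRB-galaxy correlation
to the template $C_\ell^{fg} = \alpha e^{-\ell^2/L^2}$, treating the
amplitude $\alpha$ as a free parameter, and varying the template
scale $L$ over the range $315 \le L \le 1396$. Let us
temporarily assume that $L$ is known in advance. In this
case, an optimal estimator for $\alpha$ is

$$ \hat\alpha_L = \frac{1}{\mathcal{N}_L} \sum_{\ell \ge \ell_{\min}} (2\ell+1) \, \frac{e^{-\ell^2/L^2}}{C_\ell^{gg}} \, \hat{C}_\ell^{fg} , \tag{11} $$

where $\hat{C}_\ell^{fg}$ was defined in Eq. (10), and the normalization
$\mathcal{N}_L$ is defined by

$$ \mathcal{N}_L = \sum_{\ell \ge \ell_{\min}} (2\ell+1) \, \frac{e^{-2\ell^2/L^2}}{C_\ell^{gg}} . \tag{12} $$

We have included a cutoff at $\ell_{\min} = 50$ to mitigate
possible large-scale systematics. This is a conservative
choice, since Figure 5 does not show evidence for systematic
power in the auto power spectrum $C_\ell^{ff}$ for $\ell \gtrsim 15$.
Eq. (11) is derived by noting that

$$ \begin{aligned} \mathrm{Var}(\hat{C}_\ell^{fg}) &\propto \frac{C_\ell^{ff} C_\ell^{gg} + (C_\ell^{fg})^2}{2\ell+1} \\ &\propto \frac{C_\ell^{gg}}{2\ell+1} , \end{aligned} \tag{13} $$

where the first line follows from Wick’s theorem, and the
second line follows since $(C_\ell^{fg})^2 \ll C_\ell^{ff} C_\ell^{gg}$, and $C_\ell^{ff}$ is
nearly constant in $\ell$.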
We define the quantity

$$ \mathrm{SNR}_L = \frac{\hat\alpha_L}{\mathrm{Var}(\hat\alpha_L)^{1/2}} , \tag{14} $$

which is the statistical significance of the $C_\ell^{fg}$ detection
in “sigmas”, for a fixed choice of $L$. In Figure 6, we
show the quantity SNR$_L$ as a function of scale $L$.
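The template fit can be sketched in a few lines, assuming the inverse-variance weighting implied by Eq. (13), i.e. weights proportional to $(2\ell+1)\,e^{-\ell^2/L^2}/C_\ell^{gg}$. The function names are hypothetical, and estimating $\mathrm{Var}(\hat\alpha_L)$ from the scatter of mock-catalog values is an assumption of this sketch, in the spirit of the Monte Carlo error bars of §3.3.

```python
import math


def alpha_hat(cl_fg, cl_gg, L, l_min=50):
    """Amplitude estimator of Eqs. (11)-(12); cl_fg[l] and cl_gg[l] are indexed by l."""
    numerator = 0.0
    normalization = 0.0
    for ell in range(l_min, len(cl_fg)):
        template = math.exp(-(ell / L) ** 2)
        weight = (2 * ell + 1) * template / cl_gg[ell]
        numerator += weight * cl_fg[ell]
        normalization += weight * template   # Eq. (12)
    return numerator / normalization


def snr(alpha_data, alpha_mocks):
    """Eq. (14), with Var(alpha_hat) taken from mock values (an assumption)."""
    n = len(alpha_mocks)
    mean = sum(alpha_mocks) / n
    variance = sum((a - mean) ** 2 for a in alpha_mocks) / (n - 1)
    return alpha_data / math.sqrt(variance)


# Sanity check: a noiseless template with amplitude 3e-4 is recovered.
L = 500.0
cl_gg = [1e-5 + 1e-3 / (ell + 1) for ell in range(2001)]   # toy galaxy spectrum
cl_fg = [3e-4 * math.exp(-(ell / L) ** 2) for ell in range(2001)]
print(alpha_hat(cl_fg, cl_gg, L))
```

The normalization in Eq. (12) makes the estimator unbiased for any weighting, which is why the noiseless input amplitude is recovered regardless of the toy $C_\ell^{gg}$.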

We pause for a notational comment: throughout the
paper, $\ell$ denotes a multipole (as in $C_\ell^{fg}$), and $L$ denotes
the template scale defined in Eq. (7). The value of SNR$_L$
(or $\hat\alpha_L$) is obtained by summing $\hat{C}_\ell^{fg}$ over $\ell \lesssim L$, as in
Eq. (11). When $\hat{C}_\ell^{fg}$ is computed as a function of $\ell$
(Figure 4), neighboring $\ell$ bins are nearly uncorrelated,
whereas when SNR$_L$ is computed as a function of $L$
(Figure 6), nearby $L$-values are highly correlated.

In Figure 6, it is seen that SNR$_L$ can be as large as
2.67, for a certain choice of $L$ and galaxy survey (namely
DESI-BGS at $L = 1396$). However, it would be incorrect
to interpret this as a 2.67σ detection, since the value of
$L$ has been cherry-picked to maximize the signal.

To quantify statistical significance in a way that accounts
for the choice of $L$ (the “look-elsewhere effect”),
we restrict the search to $315 \le L \le 1396$ and define

$$ \mathrm{SNR}_{\max} = \max_{315 \le L \le 1396} \mathrm{SNR}_L . \tag{15} $$
Figure 3. CHIME/FRB overdensity map $\delta_f(x)$, and galaxy overdensity maps $\delta_g(x)$ for each
galaxy survey. Maps are shown in Mollweide projection, centered on $l = 180°$ in the Galactic
coordinate system, after applying the angular masks used in the analysis pipeline. To interpret
the color scale, note that by Eq. (8), each object in a pixel contributes
$1/(n_Y^{2d}\Omega_{\rm pix})$ to the overdensity $\delta_Y$.


For fixed $L$, SNR$_L$ is approximately Gaussian distributed,
and represents statistical significance in “sigmas”.
Since SNR$_{\max}$ is obtained by maximizing over
trial $L$-values, SNR$_{\max}$ is non-Gaussian, and we assign
statistical significance by Monte Carlo inference.

In more detail, we compare the “data” value of
SNR$_{\max}$ (e.g. SNR$_{\max}$ = 2.67 for DESI-BGS) to an ensemble
of Monte Carlo simulations, obtained by cross-correlating
mock FRB catalogs with the real galaxy catalog
as in §3.3. We assign a p-value by computing the
fraction of mocks with SNR$_{\max}^{\rm (mock)}$ > SNR$_{\max}^{\rm (data)}$. We find
$p = 0.0166$ for DESI-BGS, i.e. evidence for a correlation
at 98.34% CL after accounting for the look-elsewhere effect
in $L$. The p-values for the other galaxy surveys are
shown in Figure 6.
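The look-elsewhere-corrected p-value described above reduces to two small steps: maximize SNR$_L$ over the trial scales for the data and for each mock, then count the mocks that exceed the data. The sketch below uses hypothetical function names and toy numbers; it only illustrates the counting logic.

```python
def snr_max(snr_by_scale):
    """Eq. (15): maximum of SNR_L over the trial template scales L."""
    return max(snr_by_scale)


def mc_p_value(snr_max_data, snr_max_mocks):
    """Fraction of mock catalogs whose SNR_max exceeds the data value."""
    n_exceed = sum(1 for s in snr_max_mocks if s > snr_max_data)
    return n_exceed / len(snr_max_mocks)


# Toy example with made-up SNR values for the data and four mocks.
data_value = snr_max([1.2, 2.0, 0.3])
mock_values = [snr_max(m) for m in ([0.5, 1.0], [2.5, 0.1], [3.0, 2.9], [0.2, 0.5])]
print(data_value, mc_p_value(data_value, mock_values))
```

Because the maximization is applied identically to the data and to every mock, the resulting p-value accounts for the freedom in choosing $L$. The resolution of this estimate is limited to one over the number of mocks.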
Figure 4. FRB-galaxy cross power spectrum $\hat{C}_\ell^{fg}$ in a set
of nonoverlapping $\ell$ bins delimited by vertical lines, with 1σ
error bars. Data points are shifted slightly from the center
of corresponding $\ell$ bins for visual clarity. Here, we have used
all galaxies in the catalogs; if we restrict the redshift ranges,
then the correlation is more significant (Figure 7).


Our interpretation is that this level of evidence is in- 
triguing, but not high enough to be conclusive. There- 
fore, we do not interpret the FRB-galaxy correlation 
in Figures 4 and 6 as a detection. However, in the 
next subsection we will restrict the redshift range of the 
galaxy catalog (accounting for the look-elsewhere effect 
in choice of redshift range) and find a high-significance 
detection. 


3.5. Redshift dependence 


To illustrate our method for studying redshift dependence,
we will use the WISExSCOS galaxy catalog as
a running example. Suppose we cross-correlate CHIME
FRBs with WISExSCOS galaxies above some minimum
redshift $z_{\min}$, where $z_{\min}$ is a free parameter that will be
varied. For each $z_{\min}$, we repeat the analysis of the previous
subsection. The power spectrum $C_\ell^{fg}(z_{\min})$ and
quantity SNR$_L(z_{\min})$ (defined in Eq. 14) are now functions
of two parameters: $z_{\min}$ and template scale $L$.

In the top panels of Figure 7, we show the power spectrum
$C_\ell^{fg}(z_{\min})$ for the fixed choice of redshift $z_{\min} = 0.3125$,
and SNR$_L(z_{\min})$ as a function of $L$ and $z_{\min}$. For
specific parameter choices, we see a large FRB-galaxy
correlation, e.g. SNR$_L(z_{\min})$ = 4.88 at $L = 543$ and
$z_{\min} = 0.3125$. As in the previous subsection, this would
imply a 4.88σ cross-correlation for these cherry-picked
values of ($L$, $z_{\min}$), but does not account for the look-elsewhere
effect in choosing these values.

Figure 5. Angular auto power spectrum $C_\ell^{ff}$ for the
CHIME/FRB catalog. Transparent lines represent 100 mock
FRB catalogs that spatially model the real data. Throughout
the analysis, we assume that the power spectrum $C_\ell^{ff}$
approaches a constant (dashed line) on small scales (high $\ell$).
Specifically, $C_\ell^{ff} \approx 1/n_f^{2d}$ for $315 \le \ell \le 1396$.

To assign statistical significance in a way that accounts
for the look-elsewhere effect, we use the same
method as the previous subsection, except that we now
scan over two parameters ($L$, $z_{\min}$) rather than one ($L$).
Formally, we define

$$ \mathrm{SNR}_{\max} = \max_{0 \le z_{\min} \le 0.5} \ \max_{315 \le L \le 1396} \mathrm{SNR}_L(z_{\min}) , \tag{16} $$

analogously to Eq. (15) from the previous subsection. To
assign bottom-line statistical significance, we would like
to rank the “data” value SNR$_{\max}$ = 4.88 within a histogram
of SNR$_{\max}$ values obtained by cross-correlating
mock FRB catalogs with the galaxy catalog. How- 
ever, with 10^ simulations, we find that none of the 
mock catalogs actually exceed $\mathrm{SNR}_{\max} = 4.88$, so we
fit the tail of the $\mathrm{SNR}_{\max}$ distribution to an analytic
distribution (a truncated Gaussian), and compute the
p-value analytically. For details of the tail-fitting procedure,
see Appendix C. We obtain detection significance
$p = 2.7 \times 10^{-5}$ for WISExSCOS with $z_{\min} = 0.3125$.
This analysis “scans” over minimum redshift $z_{\min}$ and
scale $L$, and the significance fully accounts for the
look-elsewhere effect in these parameters.
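
The logic of the two-parameter scan can be sketched in a few lines. This is a minimal illustration rather than the pipeline used for the results above: the arrays `snr_data` and `snr_mocks` are hypothetical stand-ins for the SNR evaluated on a grid of redshift endpoints and template scales, for the real catalog and for each mock catalog, and the p-value is a plain empirical rank (the tail fit of Appendix C is not reproduced here).

```python
import numpy as np

def snr_max(snr_grid):
    """Maximum of SNR over the whole (z_min, L) grid, as in Eq. (16)."""
    return float(np.max(snr_grid))

def empirical_p_value(snr_data, snr_mocks):
    """Fraction of mocks whose grid maximum is at least the data maximum.

    snr_data  : array of shape (n_z, n_L) for the real catalog
    snr_mocks : array of shape (n_mock, n_z, n_L), one grid per mock
    Each mock is maximized over the same grid as the data, so the
    look-elsewhere effect in (z_min, L) is included automatically.
    """
    data_max = snr_max(snr_data)
    mock_max = snr_mocks.reshape(snr_mocks.shape[0], -1).max(axis=1)
    n_exceed = int(np.sum(mock_max >= data_max))
    return n_exceed / snr_mocks.shape[0], n_exceed

# Toy demonstration with Gaussian noise in place of real measurements.
rng = np.random.default_rng(0)
toy_mocks = rng.standard_normal((1000, 8, 12))
toy_data = rng.standard_normal((8, 12))
toy_data[3, 5] = 4.88  # pretend one grid point shows a strong correlation
p_toy, n_toy = empirical_p_value(toy_data, toy_mocks)
```

When no mock exceeds the data value, the empirical estimate only bounds the p-value from above at roughly one over the number of mocks, which is why an analytic fit to the tail is needed to quote a smaller value.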

Similarly, we get $p = 3.1 \times 10^{-4}$ for DESI-BGS with
$z_{\min} = 0.295$, scanning over $z_{\min}$ and $L$. For DESI-LRG,
we use a maximum redshift $z_{\max}$ instead of a minimum redshift
$z_{\min}$, since DESI-LRG is at higher redshift
than WISExSCOS or DESI-BGS (Figure 1). Scanning
over $z_{\max}$ and scale $L$, we obtain $p = 4.1 \times 10^{-4}$ with
$z_{\max} = 0.485$ for DESI-LRG. These results are shown in
Figure 7.

Finally, we find borderline evidence $p = 0.0421$ ($L = 1396$,
$z_{\max} = 0.86$) for a cross-correlation between DESI-ELG
galaxies (varying $z_{\max}$) and CHIME FRBs with
DM > 500 pc cm$^{-3}$, where the choice of minimum DM

Figure 6. Quantity $\mathrm{SNR}_L$, defined in Eq. (14), as a function
of template scale $L$. As explained in §3.4, $\mathrm{SNR}_L$ is the
statistical significance of the FRB-galaxy correlation in “sigmas”,
for a fixed choice of $L$. The p-values in the legend are
bottom-line detection significances after accounting for the
look-elsewhere effect in $L$. Here, we have used all galaxies
in the catalogs; if we restrict the redshift ranges, then the
detection significance is higher (Figure 7).


is fixed. To justify this choice of $\mathrm{DM}_{\min}$, note that since
host DMs must be positive, we do not expect a correlation
between DESI-ELG galaxies ($z_{\min} = 0.6$) and
CHIME FRBs with DM < 500 pc cm$^{-3}$ (allowing for
statistical fluctuations in $\mathrm{DM}_{\rm IGM}$ on the order of 40
pc cm$^{-3}$). We do not find any statistically significant
detection with 2MPZ.

These results are consistent with a simple picture
in which the FRB-galaxy correlation mainly comes
from galaxies in redshift range $0.3 \lesssim z \lesssim 0.5$. For
WISExSCOS and DESI-BGS, the maximum survey
redshifts are 0.5 and 0.4 respectively, and we find a
strong detection when we impose a minimum redshift
$z_{\min} \sim 0.3$. For DESI-LRG, the minimum survey redshift
is 0.3, and we find a strong detection when we
impose a maximum redshift $z_{\max} \sim 0.5$. The borderline
detection in DESI-ELG and nondetection in 2MPZ
are also consistent with this picture, in the sense that
these catalogs do not overlap with the redshift range
$0.3 \lesssim z \lesssim 0.5$.

As a direct way of seeing that the FRB-galaxy correlation
is sourced by redshift range $0.3 \lesssim z \lesssim 0.5$, in
Figure 8 we cross-correlate the FRB catalog with the
combined BGS+LRG catalog (Table 1, bottom row) in
nonoverlapping redshift bins with $0.05 < z < 1$. It is
seen that the cross-correlation is driven by redshift range
$0.3 \lesssim z \lesssim 0.5$. (The bin at $z \sim 0.75$ is nonzero at $2.2\sigma$,
which we interpret as borderline statistical significance,
since there are 10 bins.)
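
To see why a single $2.2\sigma$ bin is only borderline when 10 bins are examined, one can apply a simple trials correction. The sketch below assumes independent bins with Gaussian errors and uses a two-sided p-value; these are illustrative assumptions and not the procedure used elsewhere in this paper.

```python
import math

def two_sided_p(n_sigma):
    """Two-sided Gaussian tail probability for an n_sigma deviation."""
    return math.erfc(n_sigma / math.sqrt(2.0))

def trials_corrected_p(p_single, n_trials):
    """Probability that at least one of n_trials independent bins
    fluctuates at least as far as the observed one."""
    return 1.0 - (1.0 - p_single) ** n_trials

p_bin = two_sided_p(2.2)                # single-bin p-value
p_any = trials_corrected_p(p_bin, 10)   # after accounting for 10 bins
```

Under these assumptions, roughly one realization in four would show at least one bin this far from zero by chance.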

In Appendix B, we examine the robustness of these 
results using null tests and do not find any evidence for 
systematic biases. 


4. INTERPRETATION 


So far, we have concentrated on establishing statis- 
tical significance of the FRB-galaxy correlation, in a 
Monte Carlo simulation pipeline that accounts for look- 
elsewhere effects. In this section, we will interpret the 
FRB-galaxy correlation, and explore implications for 
FRBs. 

As explained in §3.1, the output of our pipeline is a
constraint on the coefficient $\alpha$ in the template fit:

$$C_\ell^{fg} = \alpha \, e^{-\ell^2/L^2}, \tag{17}$$

where the factor $e^{-\ell^2/L^2}$ is a Gaussian approximation to
the high-$\ell$ suppression due to FRB/galaxy profiles and
the instrumental beam.

At several points in this section, we will want to compare
our FRB-galaxy correlation results to a model for
$C_\ell^{fg}$. To do this, we interpret the low-$\ell$ limit of the
model as a prediction for the coefficient $\alpha$ above.
Formally, we define

$$\alpha = \lim_{\ell \to 0} C_\ell^{fg}, \tag{18}$$

and compare this model prediction for $\alpha$ to the value
of $(\hat\alpha_L)_{L=1000}$, where the estimator $\hat\alpha_L$ was defined in
Eq. (11). For simplicity we will fix $L = 1000$, since this
gives a high-significance detection of the FRB-galaxy
correlation in all three galaxy surveys (see Figure 7).


4.1. Link counting


In this subsection, we will interpret the amplitude
of the FRB-galaxy correlation $C_\ell^{fg}$ in an intuitive way.
First, we fix a galaxy catalog and redshift range. As a
definition, we say that an FRB is linked to a galaxy if
they are in the same dark matter halo. For each FRB
$f$, we define the link count $n_f$ by

$$n_f = \text{number of survey galaxies linked to FRB } f. \tag{19}$$

Given an FRB catalog, we define the mean link count $\eta$:

$$\eta = \langle n_f \rangle, \tag{20}$$

where the expectation value $\langle \cdot \rangle$ is taken over FRBs in
the catalog.

To connect these definitions with our FRB-galaxy correlation
results, we note that:

$$\alpha = \lim_{\ell \to 0} C_\ell^{fg} = \frac{\eta}{n_g^{2d}}, \tag{21}$$

Figure 7. FRB-galaxy correlation analysis with two parameters: template scale $L$ (defined in Eq. 7), and a redshift endpoint
(either $z_{\min}$ for WISExSCOS and DESI-BGS, or $z_{\max}$ for DESI-LRG). Left column: Angular cross power spectrum $C_\ell^{fg}$ and
auto power spectrum $C_\ell^{gg}$, for the fixed choice of redshift endpoint that maximizes FRB-galaxy correlation. The cross power
“fit” is a best-fit template of the form $C_\ell^{fg} = \alpha e^{-\ell^2/L^2}$. Right column: Quantity $\mathrm{SNR}_L$, defined in Eq. (14), as a function
of $L$ and redshift endpoint. As explained in §3.4, $\mathrm{SNR}_L$ is statistical significance of the FRB-galaxy correlation in “sigmas”,
for a fixed choice of $L$ and redshift endpoint. The p-values in the legend are bottom-line significance after accounting for the
look-elsewhere effect in these choices (see §3.5).

Figure 8. Redshift dependence of the FRB-galaxy correlation.
We divide the BGS+LRG catalog into nonoverlapping
redshift bins (dotted lines) and cross-correlate with CHIME
FRBs. The quantity $(\hat\alpha_L)_{L=1000}$ on the y-axis is a measure
of the level of cross-correlation, defined in Eq. (11).


where the first equality is Eq. (18), and the second
equality follows from a short halo model calculation
(Rafiei-Ravandi et al. 2020). That is, the amplitude $\alpha$
of the FRB-galaxy correlation (in the one-halo regime)
is equivalent to a measurement of the mean link count
$\eta$. This provides a more intuitive interpretation of the
amplitude.

In each row of Table 2, we specify a choice of galaxy
catalog and redshift range. The redshift ranges have
been chosen to maximize $C_\ell^{fg}$, as in §3.4. In the third
column, we give the constraint on $\alpha$ obtained from the
estimator $\hat\alpha_L$, at $L = 1000$. In the last column, we have
translated this constraint on $\alpha$ to a constraint on $\eta$, using
Eq. (21).
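
As a quick consistency check, the link counts in Table 2 can be recomputed from the other two columns. The sketch below assumes that Eq. (21) amounts to $\eta = \alpha \, n_g^{2d}$, and reads the amplitudes as multiples of $10^{-6}$ and the galaxy densities as $9.92 \times 10^4$, $7.94 \times 10^5$, and $2.83 \times 10^5$ sr$^{-1}$; the variable names are ours.

```python
# (alpha, sigma_alpha, n_g^2d in sr^-1), as read from Table 2.
table2 = {
    "WISExSCOS": (4.35e-6, 0.97e-6, 9.92e4),
    "DESI-BGS": (2.69e-6, 0.67e-6, 7.94e5),
    "DESI-LRG": (3.94e-6, 0.93e-6, 2.83e5),
}

def link_count(alpha, sigma_alpha, n_g_2d):
    """Mean link count eta = alpha * n_g^2d and its 1-sigma error."""
    return alpha * n_g_2d, sigma_alpha * n_g_2d

# Maps each survey name to the pair (eta, sigma_eta).
eta = {name: link_count(*row) for name, row in table2.items()}
```

With these inputs, the products agree with the last column of Table 2 to within rounding, which supports this reading of the table.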

Taken together, the $\eta$ measurements in Table 2 show
that the CHIME/FRB catalog has mean link counts of
order unity with galaxies in the range $0.3 \lesssim z \lesssim 0.5$.
The precise value of $\eta$ depends on the specific galaxy
survey considered. Note that different galaxy surveys
will have different values of $\eta$, since the number of galaxies
per halo (and to some extent the population of halos
that is sampled) will be different.

Since FRBs outside the redshift range of the galaxy
catalog do not contribute to $\eta$, we write $\eta = p\tilde\eta$, where $p$
is the probability that an FRB is in the catalog redshift
range and $\tilde\eta$ is the mean link count of FRBs that are in
the catalog redshift range.

For the galaxy surveys considered here, we expect $\tilde\eta$
to be of order unity, since dark matter halos rarely contain
more than a few catalog galaxies. To justify this
statement, we note that $C_\ell^{gg}$ is $\sim 2$ times larger than
the Poisson noise $1/n_g^{2d}$ in the one-halo regime (see Figure 7).
By a link counting argument similar to Eq. (21),
this implies that $\langle N_g^2 \rangle \sim 2 \langle N_g \rangle$, where $N_g$ is the number
of galaxies in a halo, and the expectation values are
taken over halos.

Since $\eta = p\tilde\eta$ is of order unity (by Table 2), and $\tilde\eta$ is
of order unity (by the argument in the previous paragraph),
we conclude that $p$ is of order unity. That is, an
order-one fraction of CHIME FRBs are in the redshift
range $0.3 \lesssim z \lesssim 0.5$.

We have phrased this conclusion as a qualitative statement
(“order-one fraction”) since it is difficult to assign
a precise upper bound to $\tilde\eta$. More generally, it is difficult
to infer the FRB redshift distribution ($dn_f^{2d}/dz$) from
the FRB-galaxy correlation in the one-halo regime, since
the level of correlation is proportional to $\tilde\eta \, (dn_f^{2d}/dz)$,
with no obvious way of disentangling the two factors.
Future CHIME/FRB catalogs should contain enough
FRBs to detect the FRB-galaxy correlation on two-halo
scales ($\ell \sim 100$) (Rafiei-Ravandi et al. 2020), which will
help break the degeneracy and measure ($dn_f^{2d}/dz$) and
$\tilde\eta$ separately.


4.2. DM dependence 


In Figure 9, we divide the FRB catalog into extra- 
galactic DM bins and explore the DM dependence of 
the FRB-galaxy cross-correlation. 

A striking feature in Figure 9 is the nonzero correlation
in the three highest-DM bins, corresponding
to extragalactic DM > 785 pc cm$^{-3}$.$^{4}$ For reference,
the last three bins represent 7%, 6%, and 15% of the
CHIME/FRB catalog, respectively. At the redshift of
the galaxy surveys ($z \sim 0.4$), the IGM contribution to
the DM is $\mathrm{DM}_{\rm IGM}(z) \approx 360$ pc cm$^{-3}$. Therefore, the
observed FRB-galaxy correlation at DM > 785 pc cm$^{-3}$
is evidence for a subpopulation of FRBs with host DMs
of order $\mathrm{DM}_{\rm host} \sim 400$ pc cm$^{-3}$.
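
The arithmetic behind this estimate is a simple DM budget. The sketch below works in the observer frame and ignores both the scatter in the IGM contribution and any $(1+z)$ rest-frame factor, so it should be read as an order-of-magnitude illustration; the function name is ours.

```python
def host_dm_lower_bound(dm_extragalactic, dm_igm):
    """Observer-frame DM left over for the host after removing the IGM term.

    Both arguments are in pc cm^-3. The Galactic contribution is assumed
    to have been subtracted already.
    """
    return dm_extragalactic - dm_igm

# Lower edge of the three highest-DM bins, and the IGM term at z ~ 0.4.
dm_host_min = host_dm_lower_bound(785.0, 360.0)  # pc cm^-3
```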

This may appear to be in tension with recent direct
associations between FRBs and host galaxies, which have
typically been studied only for lower-redshift FRBs. At
the time of this writing, 14 FRBs have been localized to
host galaxies,$^{5}$ all of which have $\mathrm{DM}_{\rm host} \lesssim 200$ pc cm$^{-3}$
(Spitler et al. 2016; Bassa et al. 2017; Chatterjee et al. 
2017; Kokubo et al. 2017; Tendulkar et al. 2017; Ban- 


4 A technical comment here: for some DM bins in Figure 9, the
large values of $C_\ell^{fg}$ lead to link counts $\eta$ that are a few times
larger than the link counts reported in Table 2 for the whole catalog,
although statistical errors are large. However, the correlation
coefficient between the FRB and galaxy fields is never larger
than 1. In all cases, the field-level correlation $C_\ell^{fg}/(C_\ell^{ff} C_\ell^{gg})^{1/2}$
is of order 0.01 or smaller.

5 https://frbhosts.org/#explore 

Survey       $[z_{\min}, z_{\max}]$   $(\hat\alpha_L)_{L=1000}$           $n_g^{2d}$ [sr$^{-1}$]   $\eta$
WISExSCOS    [0.3125, 0.5]            $(4.35 \pm 0.97) \times 10^{-6}$    $9.92 \times 10^{4}$     $0.432 \pm 0.096$
DESI-BGS     [0.295, 0.4]             $(2.69 \pm 0.67) \times 10^{-6}$    $7.94 \times 10^{5}$     $2.13 \pm 0.53$
DESI-LRG     [0.3, 0.485]             $(3.94 \pm 0.93) \times 10^{-6}$    $2.83 \times 10^{5}$     $1.11 \pm 0.26$


Table 2. Clustering analysis in §4.1. The FRB-galaxy clustering statistic $\hat\alpha_L$ (Eq. 11) can be translated to a constraint on
$\eta$, the average number of survey galaxies in the same dark matter halo as a CHIME/FRB source (see text for details).

Figure 9. DM dependence of the FRB-galaxy correlation.
We divide the CHIME/FRB catalog into DM bins (delimited
by vertical lines) after subtracting the YMW16 estimate of
the Galactic DM and cross-correlate each DM bin with the
galaxy catalogs. The last DM bin extends to DM = 3020
pc cm$^{-3}$. For each galaxy survey, we use the same redshift
range (see legend) as in the left panel of Figure 7. The quantity
$(\hat\alpha_L)_{L=1000}$ on the y-axis is defined in Eq. (11) and measures
the level of FRB-galaxy correlation. This quantity is a
per-object statistic that is derived from $C_\ell^{fg}$. Hence, it does
not necessarily follow number density variations in Figure 1.


nister et al. 2019; Prochaska et al. 2019; Ravi et al. 
2019; Chittidi et al. 2020; Heintz et al. 2020; Law et al. 
2020; Macquart et al. 2020; Mannings et al. 2021; Mar- 
cote et al. 2020; Simha et al. 2020; Bhandari et al. 
2020a,b; CHIME/FRB Collaboration 2020b; Bhardwaj 
et al. 2021; James et al. 2021a). The rest of this section 
is devoted to interpreting this result further. 

In Figure 9, the DM bin at 785 < DM < 916 pc cm$^{-3}$
is an outlier, suggesting a narrow feature in the DM de- 
pendence of the FRB-galaxy correlation. Given the er- 
ror bars, it is difficult to say with statistical significance 
whether the apparent narrowness is real, or whether the 
true DM dependence is slowly varying. A crucial point 
here is that the three galaxy catalogs are highly corre- 
lated spatially (after restricting to the appropriate red- 
shift ranges), which implies that the three measurements 
in Figure 9 have highly correlated statistical errors. Fu- 
ture CHIME/FRB catalogs will have smaller error bars 


and can statistically distinguish a narrow feature from 
slowly varying DM dependence. 

As a check, we remade Figure 9 using the NE2001 
(Cordes & Lazio 2002) model for Galactic DM, instead 
of the YMW16 model. The effect of this change is small
compared to the statistical errors in Figure 9. 

We also performed the following visual check. The
outlier bin with 785 < DM < 916 pc cm$^{-3}$ in Figure 9
only contains 12 FRBs in the DESI footprint. In Figure 10
we show the DESI-BGS galaxies in the vicinity
of each FRB. The large FRB-galaxy correlation can be
seen visually as an excess of galaxies (relative to random
catalogs) within 7' of an FRB.$^{6}$ None of the individual
FRBs in Figure 9 gives a statistically significant
cross-correlation on its own, but the total FRB-galaxy
correlation is significant at the $3\sigma$-$4\sigma$ level. (We caution
the reader that the galaxy counts in Figure 10 do not
obey Poisson statistics, since the galaxies are clustered.)
There are no visual red flags in Figure 10, such as a single
FRB that gives an implausibly large contribution to
the cross-correlation.
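
The 7' radius used in this check follows from the relation $\Theta = \sqrt{8}/L$ given in the accompanying footnote, evaluated at the template scale $L = 1396$. The short sketch below carries out the unit conversion; the function name is ours.

```python
import math

def tophat_radius_arcmin(L):
    """Radius Theta = sqrt(8)/L (radians) converted to arcminutes.

    The relation comes from matching the top-hat variance Theta^2/2
    to the Gaussian variance 4/L^2.
    """
    theta_rad = math.sqrt(8.0) / L
    return math.degrees(theta_rad) * 60.0

theta_bgs = tophat_radius_arcmin(1396)  # arcminutes
```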

Finally, we address the question of whether the high-DM
signal in Figure 9 is consistent with direct host associations.
Consider the following two statements, in the
context of FRB surveys with the CHIME/FRB sensitivity:


1. A random FRB with extragalactic DM > 785
pc cm$^{-3}$ has an order-one probability of having
redshift $z \sim 0.4$ (implying $\mathrm{DM}_{\rm host} \gtrsim 400$
pc cm$^{-3}$).


2. A random FRB at redshift $z \sim 0.4$ has an order-one
probability of having extragalactic DM > 785
pc cm$^{-3}$.

The high-DM signal in Figure 9 implies statement 1, but 
not statement 2. We will now argue that statement 1 is 
actually consistent with direct associations. 


6 The scale $\Theta = 7'$ was obtained as $\Theta = \sqrt{8}/L$, where $L = 1396$ is
the template scale where the DESI-BGS cross-correlation peaks
in Figure 7. The factor $\sqrt{8}$ was derived by matching the variance
($\Theta^2/2$) of a radius-$\Theta$ top hat to the variance ($4/L^2$) of a Gaussian

Figure 10. Visual representation of the cross-correlation between FRBs with 785 < DM < 916 pc cm$^{-3}$, and DESI-BGS
galaxies. There are 12 FRBs in this DM range in the DESI footprint. For each such FRB, we plot the DESI-BGS galaxies in
the redshift range $0.295 < z < 0.4$ in the vicinity of the FRB. We color-code galaxies by redshift, but note that redshift errors
are comparable ($\sigma_z \sim 0.03$) to the redshift range shown. The gray points are objects in the DESI random catalog, to give a
sense for the DESI mask geometry. The dashed circles are centered at FRBs, with radius $\Theta = 7'$ (see §4.2). The value of $N_g$
in the upper left is the observed number of galaxies in the circle. The value of $N_{\rm exp}$ is the expected number of galaxies in the
circle, inferred from randoms. The FRB-galaxy correlation appears as a statistical preference for $N_g > N_{\rm exp}$.


The key point is that there are few direct associations
at high DM. Out of the 14 direct associations to date,
only one has extragalactic DM > 785 pc cm$^{-3}$: an FRB
with YMW16-subtracted DM 850 pc cm$^{-3}$ at $z = 0.6$
(Law et al. 2020). Based on this one high-DM event, one
cannot rule out statement 1 above (note that statement
2 would clearly be inconsistent with direct associations).

Therefore, there is no inconsistency between the high- 
DM FRB-galaxy correlation in Figure 9, and direct FRB 
host associations to date. The number of direct associa- 
tions is rapidly growing, and we predict that FRBs with 
extragalactic DM > 785 pc cm$^{-3}$ at $z \sim 0.4$ will be found
in direct associations soon (see §5 for more discussion).


One final comment: we have presented statistical ev- 
idence that statement 1 is true in CHIME/FRB, but 
statement 1 depends to some extent on the selection 
function of the FRB survey. In particular, future sur- 
veys that are sensitive to fainter sources may detect 
larger numbers of high-redshift FRBs. In this scenario, 
it is possible that FRBs with extragalactic DM > 785 
pc cm$^{-3}$ will mostly come from $z \sim 0.8$, as expected
from the Macquart relation. 


4.3. Host halo DMs 


In the previous subsection, we found statistical evidence
for a population of FRBs at $z \sim 0.4$ with
$\mathrm{DM}_{\rm host} \gtrsim 400$ pc cm$^{-3}$. In this section we will propose a possible
mechanism for generating such large host DMs. Note
that for a Galactic pulsar, a DM of order 400 pc cm$^{-3}$
would be unsurprising, but pulsar sight lines lie preferentially
in the Galactic disk (boosting the DM), whereas
FRBs are observed from a random direction.


Bright galaxies in cosmological surveys are usually
found in large dark matter halos (Wechsler & Tinker
2018). Therefore, FRBs that correlate with such galaxies
may have large host DMs, due to DM contributions
from gas in the host halos. We refer to such a contribution
as the host halo DM $\mathrm{DM}_{hh}$, since the term “halo
DM” is often used to refer to the contribution from the
Milky Way halo.

Figure 11. Host halo DM distributions for FRBs in a halo
of mass $M = 10^{14} M_\odot$. The host halo DM is determined by
two parameters: the distance $r$ between the FRB and halo
center, and viewing angle $\theta$. Each histogram corresponds to
one choice of $r$, with 10? values of $\theta$. The halo gas profile is
the “ICM” model from Prochaska & Zheng (2019).


Can host halo DMs plausibly be of order $\mathrm{DM}_{hh} \gtrsim 400$
pc cm$^{-3}$? To answer this question, in Figure 11, we
show $\mathrm{DM}_{hh}$ histograms for simulated FRBs in a halo of
mass $M = 10^{14} M_\odot$. The halo gas profile is the intracluster
medium (ICM) model from Prochaska & Zheng
(2019), based on X-ray observations from Vikhlinin et al.
(2006).$^{7}$ It is seen that FRBs near the centers ($r < 100$
kpc) of large ($M \sim 10^{14} M_\odot$) halos can have host halo
DMs $\mathrm{DM}_{hh} > 400$ pc cm$^{-3}$.
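
The dependence of the host halo DM on radius and viewing angle can be illustrated with a line-of-sight integral through a toy gas profile. The sketch below is not the ICM model of Prochaska & Zheng (2019) and does not use their software: it assumes a simple beta-model electron density with hypothetical parameters (`n0`, `r_c`, `beta`, and a truncation radius), chosen only to show the geometry of the calculation.

```python
import numpy as np

def n_e_beta(R_kpc, n0=5e-3, r_c=150.0, beta=2.0 / 3.0):
    """Toy beta-model electron density in cm^-3 (hypothetical parameters)."""
    return n0 * (1.0 + (R_kpc / r_c) ** 2) ** (-1.5 * beta)

def host_halo_dm(r_kpc, theta, r_max_kpc=1500.0, n_steps=20000):
    """Host halo DM in pc cm^-3 for an FRB at radius r_kpc.

    theta is the angle between the sight line and the outward radial
    direction, so theta = pi looks straight through the halo center.
    """
    s = np.linspace(0.0, 2.0 * r_max_kpc, n_steps)   # path length in kpc
    R = np.sqrt(r_kpc**2 + s**2 + 2.0 * r_kpc * s * np.cos(theta))
    n_e = np.where(R <= r_max_kpc, n_e_beta(R), 0.0)  # truncate the halo
    ds = s[1] - s[0]
    dm_kpc = np.sum(0.5 * (n_e[1:] + n_e[:-1])) * ds  # trapezoid rule
    return 1.0e3 * dm_kpc                             # kpc -> pc

# Isotropic viewing angles for an FRB 50 kpc from the halo center.
rng = np.random.default_rng(1)
cos_theta = rng.uniform(-1.0, 1.0, size=200)
dm_samples = np.array([host_halo_dm(50.0, np.arccos(c)) for c in cos_theta])
```

In this toy model the DM is largest for sight lines that pass through the dense core and falls off for FRBs placed at larger radii, which is the qualitative behavior described above.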


7 To calculate the host halo DM $\mathrm{DM}_{hh} = \int dr \, n_e(r)$, we used a
slightly modified version of the FRB software (github.com/FRBs/FRB)
by Prochaska et al. We thank the authors for making their
software public.


Thus, the high-DM signal in Figure 9 is plausibly explained
by a small subpopulation of FRBs at redshift
$0.3 \lesssim z \lesssim 0.5$ near the centers of large halos. Such a
subpopulation could have $\mathrm{DM}_{\rm host} \gtrsim 400$ pc cm$^{-3}$, and
strongly correlate with galaxies, since bright galaxies are
often in high-mass halos.

This mechanism is a proof of concept to show that
$\mathrm{DM}_{hh} \gtrsim 400$ pc cm$^{-3}$ is plausible in some halo gas models.
Other mechanisms may also be possible, such as
augmentation by intervening foreground galaxies (James
et al. 2021b). We emphasize that the statistical evidence
for a population of FRBs with $\mathrm{DM}_{\rm host} > 400$ pc cm$^{-3}$,
presented in the previous subsection, does not depend
on the assumption of a particular model or mechanism.


4.4. Propagation effects 


So far, we have assumed that the observed FRB- 
galaxy correlation is owing to spatial correlations be- 
tween the FRB and galaxy populations. In this sub- 
section, we will explore the alternate hypothesis that 
host DMs are always small (say $\mathrm{DM}_{\rm host} \sim 70$ pc cm$^{-3}$),
and that propagation effects are responsible for the ob- 
served correlation between z ~ 0.4 galaxies and high- 
DM FRBs. 

“Propagation effects” is a catch-all term for what happens
to radio waves during their voyage from source to
observer due to intervening plasma. For example, dis- 
persion, scattering, and plasma lensing are all propaga- 
tion effects. Propagation effects can produce an appar- 
ent correlation between low-redshift galaxies and high- 
redshift FRBs, even when the underlying populations 
are not spatially correlated. 

For example, low-redshift galaxies are spatially corre- 
lated with free electrons, which contribute to the DM 
of background FRBs. The DM contribution can either 
increase or decrease the probability of detecting a back- 
ground FRB, depending on the selection function of the 
instrument. This effect can produce an apparent cor- 
relation or anticorrelation between low-z galaxies and 
high-z FRBs, in the absence of any spatial correlation 
between the galaxy and FRB populations. 

Here, we will calculate contributions to $C_\ell^{fg}$ from
propagation effects, using formalism from Rafiei-Ravandi
et al. (2020). We will use a fiducial model
in which host DMs are small ($\mathrm{DM}_{\rm host} \sim 70$ pc cm$^{-3}$),
implying negligible spatial correlation between z ~ 0.4 
galaxies and high-DM FRBs. This is because we are 
interested in exploring the hypothesis that propagation 
effects (not large host DMs) are entirely responsible for 
the observed DM dependence in Figure 9. We describe 
the fiducial model in the next few paragraphs. 

Figure 12. DM distribution (solid curve) for the fiducial
FRB model used to study propagation effects in §4.4, with
the CHIME/FRB DM distribution shown for comparison
(histogram). In this model, host DMs are small, to explore
the hypothesis that the correlation between $z \sim 0.4$ galaxies
and high-DM FRBs is due to propagation effects, rather
than large host DMs. The host DM distribution (not shown)
is sharply peaked at $\mathrm{DM}_{\rm host} \sim 70$ pc cm$^{-3}$.


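First, we model the distribution of FRBs in redshift
and DM. We assume that the FRB redshift distribution
is

$$\frac{dn_f^{2d}}{dz} \propto z^2 e^{-\alpha z}, \tag{22}$$

and that the host DM distribution is lognormal, and
independent of redshift:

$$p(\mathrm{DM}_{\rm host}) \propto \frac{1}{\mathrm{DM}_{\rm host}} \exp\left( -\frac{(\log \mathrm{DM}_{\rm host} - \mu_{\log})^2}{2\sigma_{\log}^2} \right). \tag{23}$$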
In Eqs. (22), (23), we choose parameters

$$\alpha = 6.7, \qquad \mu_{\log} = 4.2, \qquad \sigma_{\log} = 0.5. \tag{24}$$

The total DM is $\mathrm{DM} = \mathrm{DM}_{\rm IGM}(z) + \mathrm{DM}_{\rm host}$. These
parameters have been chosen so that the median FRB
redshift is 0.4, the median host DM is 67 pc cm$^{-3}$, and
the distribution of total DMs is similar to the observed
DM distribution in Figure 12.
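
The quoted medians can be checked directly from the parameter values. The sketch below assumes that the redshift distribution has the form $dn/dz \propto z^2 e^{-\alpha z}$ with $\alpha = 6.7$ (our reading of Eqs. 22 and 24) and that the logarithm in the host DM distribution is the natural log with $\mu_{\log} = 4.2$; the function names are ours.

```python
import math

def gamma3_cdf(x):
    """CDF of p(x) proportional to x^2 exp(-x), i.e. a Gamma(3) variable."""
    return 1.0 - math.exp(-x) * (1.0 + x + 0.5 * x * x)

def median_redshift(alpha, tol=1e-10):
    """Median of dn/dz proportional to z^2 exp(-alpha z), by bisection.

    The substitution x = alpha * z reduces the problem to the median
    of a Gamma(3) variable, which is then rescaled by 1/alpha.
    """
    lo, hi = 0.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma3_cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / alpha

z_median = median_redshift(6.7)   # median FRB redshift of the model
dm_host_median = math.exp(4.2)    # median of a lognormal is exp(mu_log)
```

Under these assumptions both numbers land close to the values stated in the text (a median redshift near 0.4 and a median host DM near 67 pc cm$^{-3}$).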

We will also need a fiducial model for $P_{ge}(k)$, the 3D
galaxy-electron power spectrum at comoving wavenumber
$k$. For reasons that we will explain shortly, we
will need to know the one-halo contribution in the limit
$k \to 0$, which is (Rafiei-Ravandi et al. 2020)

$$\lim_{k \to 0} P_{ge}^{1h}(k, z) = \frac{\langle N_e^g \rangle}{n_{e,0}}, \tag{25}$$

where $\langle N_e^g \rangle$ is the average (over survey galaxies) number
of electrons in the halo containing a galaxy, and $n_{e,0}$
is the comoving electron number density. To compute
$\langle N_e^g \rangle$, we assume that survey galaxies are contained
in dark matter halos whose mass $M_h$ is lognormal-distributed,
with parameters:

$$\langle \lambda \rangle = 13.4, \qquad \sigma(\lambda) = 0.35, \tag{26}$$

where $\lambda = \log_{10}(M_h/M_\odot)$. This distribution is a
rough fit to the halo mass distribution shown in Figure 3
of Schaan et al. (2021) for SDSS-LOWZ, a well-characterized
$z \sim 0.3$ galaxy survey similar to the ones
considered here. We assume that these large halos
have baryon-to-matter ratio equal to the cosmic average
($\Omega_b/\Omega_m$), with ionization fraction $f_e = 0.75$.
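
To give a feel for the numbers involved, the mean electron count per halo can be estimated from the lognormal mass distribution. The sketch below is one plausible reading of the prescription, not the calculation used for the results in this paper: it assumes a cosmic baryon fraction of 0.157 (a Planck-like value that is our assumption), counts one proton mass per baryon, and multiplies by the ionization fraction 0.75.

```python
import math

M_SUN_KG = 1.989e30       # solar mass in kg
M_PROTON_KG = 1.6726e-27  # proton mass in kg

def mean_halo_mass_msun(mean_log10=13.4, sigma_log10=0.35):
    """Mean of M_h when log10(M_h/M_sun) is Gaussian (lognormal mass)."""
    s = sigma_log10 * math.log(10.0)   # width in natural-log units
    return 10.0 ** mean_log10 * math.exp(0.5 * s * s)

def mean_electrons_per_halo(f_e=0.75, baryon_fraction=0.157):
    """Rough <N_e>: f_e * (Omega_b/Omega_m) * <M_h> / m_p.

    baryon_fraction = 0.157 is an assumed value, and treating every
    baryon as one proton mass is a simplification.
    """
    m_h_kg = mean_halo_mass_msun() * M_SUN_KG
    return f_e * baryon_fraction * m_h_kg / M_PROTON_KG

n_e_halo = mean_electrons_per_halo()
```

Note that the mean mass of a lognormal distribution exceeds $10^{13.4} M_\odot$, because the high-mass tail carries extra weight in the average.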

Finally, we model the CHIME/FRB selection func- 
tion S(DM) in DM. This has been measured via Monte 
Carlo analysis of simulated events, and the result is 
shown in Figure 14 of the CHIME/FRB Catalog 1 paper 
(CHIME/FRB Collaboration 2021). Here, we will use 
the following rough visual fit: 


log S(DM) — 0.1 — 0.14 [ls (saa) Í . (27) 


The selection function S(DM) is, up to normalization, 
the probability that a random FRB with a given DM is 
detected by CHIME/FRB. As an aside, CHIME/FRB 
has a selection bias against detecting high-DM FRBs 
due to frequency channel smearing and a bias against de- 
tecting low-DM FRBs due to the details of the high-pass 
filtering used in radio frequency interference removal. 
(Scattering biases will be discussed later in this section.) 
This combination of biases results in the selection function
(Eq. 27) with a local maximum at DM ~ 1000
pc cm$^{-3}$.

With the fiducial model in the previous few paragraphs,
we now proceed to calculate contributions to
$C_\ell^{fg}$ from propagation effects.

The first propagation effect we will consider is “DM-completeness”,
described schematically as follows. Consider
a foreground population of galaxies, and a background
(i.e. higher-redshift) population of FRBs. The
galaxies are spatially correlated with ionized electrons,
which increase DMs of the FRBs, by adding dispersion
along the line of sight. This can either increase or decrease
the apparent number density of FRBs, depending
on whether $dS/d(\mathrm{DM})$ is positive or negative. This combination
of effects produces a correlation between number
densities of FRBs and galaxies, i.e. a contribution
to $C_\ell^{fg}$ that can be positive or negative.
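
The sign change described above can be made concrete with a toy selection function. The sketch below does not use the fit of Eq. (27): the function `log_selection` is a hypothetical parabola in $\log_{10}(\mathrm{DM})$ with an assumed peak at 1000 pc cm$^{-3}$ and an assumed curvature, used only to show how a small added DM raises or lowers the detected FRB density depending on which side of the peak an FRB lies.

```python
import math

def log_selection(dm, dm_peak=1000.0, curvature=0.14):
    """Toy log10 S(DM): a parabola in log10(DM) peaking at dm_peak.

    dm_peak and curvature are hypothetical illustration values.
    """
    return -curvature * math.log10(dm / dm_peak) ** 2

def fractional_count_change(dm, delta_dm, h=1e-3):
    """First-order change in detected FRB density, (dlnS/dDM) * delta_dm.

    The derivative is taken by central differences, then converted
    from log10 to natural log.
    """
    dlog10 = (log_selection(dm + h) - log_selection(dm - h)) / (2.0 * h)
    return math.log(10.0) * dlog10 * delta_dm

low_dm_effect = fractional_count_change(300.0, 20.0)    # below the peak
high_dm_effect = fractional_count_change(2000.0, 20.0)  # above the peak
```

In this toy model an added 20 pc cm$^{-3}$ increases the detection probability for a low-DM FRB and decreases it for a high-DM FRB, so the induced correlation with foreground galaxies can have either sign.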
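
In Rafiei-Ravandi et al. (2020), the contribution to
$C_\ell^{fg}$ from DM-completeness is calculated:

$$C_\ell^{fg} = \frac{1}{n_g^{2d}} \int dz \, W_f(z) \, \frac{H(z)}{\chi(z)^2} \, \frac{dn_g^{2d}}{dz} \, P_{ge}\!\left( \frac{\ell}{\chi(z)}, z \right), \tag{28}$$

where the DM-completeness weight function $W_f$ for DM
bin $[\mathrm{DM}_{\min}, \mathrm{DM}_{\max}]$ is

$$W_f(z) = \frac{n_{e,0}}{n_f^{2d}} \, \frac{1+z}{H(z)} \int_z^\infty dz' \int_{\mathrm{DM}_{\min}}^{\mathrm{DM}_{\max}} d(\mathrm{DM}) \, \frac{d^2 n_f^{2d}}{dz' \, d(\mathrm{DM})} \, \frac{d\log S}{d(\mathrm{DM})}, \tag{29}$$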

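and $\chi(z)$ is comoving distance to redshift $z$. We convert
this expression for $C_\ell^{fg}$ to an expression for our parameter
$\alpha$ as follows:

$$\begin{aligned}
\alpha &= \lim_{\ell \to 0} C_\ell^{fg} \\
&= \frac{1}{n_g^{2d}} \int dz \, W_f(z) \, \frac{H(z)}{\chi(z)^2} \, \frac{dn_g^{2d}}{dz} \, \frac{\langle N_e^g(z) \rangle}{n_{e,0}},
\end{aligned} \tag{30}$$

where we have used Eq. (18) in the first line and
Eqs. (25), (28) in the second line.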

The second propagation effect we will consider is
“DM-shifting”, which arises for an FRB catalog that has
been binned in DM, as in Figure 9. Even in the absence
of an instrumental selection function, DM fluctuations
along the line of sight can shift FRBs across DM bin
boundaries, either increasing or decreasing the observed
number density of FRBs in a given bin. This effect is distinct
from the DM-completeness effect described above,
and also produces a contribution to $C_\ell^{fg}$ that can be
positive or negative. Using results from Rafiei-Ravandi
et al. (2020), the DM-shifting bias to $\alpha_L$ is given by the
previous expression (30), but with the following expression
for the DM-shifting weight function:

$$W_f(z) = -\frac{n_{e,0}}{n_f^{2d}} \, \frac{1+z}{H(z)} \int_z^\infty dz' \left[ \frac{d^2 n_f^{2d}}{dz' \, d(\mathrm{DM})} \right]_{\mathrm{DM}_{\min}}^{\mathrm{DM}_{\max}}. \tag{31}$$
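
The boundary-term structure of the DM-shifting effect can be checked numerically with a toy DM distribution. The sketch below is an illustration under stated assumptions, not the fiducial model: it uses a hypothetical normalized distribution $f(\mathrm{DM}) \propto \mathrm{DM} \, e^{-\mathrm{DM}/s}$ with $s = 300$ pc cm$^{-3}$, shifts every DM by a small constant, and compares the exact change in a bin's population with the first-order estimate given by the difference of $f$ at the bin edges.

```python
import math

SCALE = 300.0  # pc cm^-3, hypothetical scale of the toy DM distribution

def pdf(dm):
    """Toy normalized DM distribution, f(DM) = DM exp(-DM/s) / s^2."""
    return dm * math.exp(-dm / SCALE) / SCALE**2

def cdf(dm):
    """Cumulative distribution of the toy model."""
    return 1.0 - math.exp(-dm / SCALE) * (1.0 + dm / SCALE)

def bin_fraction(dm_min, dm_max, shift=0.0):
    """Fraction of FRBs observed in [dm_min, dm_max] after adding `shift`
    to every DM (observed DM = intrinsic DM + shift)."""
    return cdf(dm_max - shift) - cdf(dm_min - shift)

def first_order_change(dm_min, dm_max, shift):
    """Boundary-term estimate: -shift * [f(dm_max) - f(dm_min)]."""
    return -shift * (pdf(dm_max) - pdf(dm_min))

exact = bin_fraction(785.0, 916.0, shift=10.0) - bin_fraction(785.0, 916.0)
approx = first_order_change(785.0, 916.0, shift=10.0)
```

For a bin on the falling side of the distribution, more FRBs are shifted in across the lower edge than out across the upper edge, so the bin population grows; on the rising side the sign reverses.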
In Figure 13, we show $\alpha_L$-biases from the DM-completeness
and DM-shifting propagation effects in our
fiducial model, computed using Eqs. (29)-(31). For simplicity,
we have approximated the precise $z$-dependence
of the redshift-binned galaxy surveys in Figure 9 by assuming
$dn_g^{2d}/dz = \text{const}$ for $0.3 < z < 0.4$. (The results
are not very sensitive to the galaxy redshift distribution.)
Comparing to the FRB-galaxy correlation shown previously
in Figure 9, we see that the total bias is $\sim 0.5\sigma$
in the second DM bin (262 < DM < 393 pc cm$^{-3}$), and

Figure 13. Predicted contribution to the FRB-galaxy correlation
$\alpha_L$ (Eq. 11) from propagation effects, in the fiducial
model from §4.4. The DM binning is the same as Figure 9.
Comparing to the error bars in Figure 9, the DM-shifting
contribution is $\sim 0.5\sigma$ in the second and third DM bins
(262 < DM < 393 and 393 < DM < 523 pc cm$^{-3}$) and
$\lesssim 0.1\sigma$ in the other bins. The DM-completeness contribution
is very small.


$\lesssim 0.1\sigma$ in the other bins. These biases are too small, and
have the wrong DM dependence, to explain the FRB-galaxy
correlation shown previously in Figure 9.

So far, we have only considered propagation effects 
involving dispersion. The next propagation effect we 
might want to consider is scattering completeness, de- 
scribed intuitively as follows. Consider a foreground 
population of galaxies and a background population of 
FRBs. The galaxies are correlated with free electrons, 
which scatter-broaden FRBs and change their observed 
number density. Since scatter-broadening always decreases
the probability that an FRB is detected, this effect
always produces negative $C_\ell^{fg}$.$^{8}$ Therefore, scattering
completeness cannot be responsible for the observed
FRB-galaxy correlation, which is positive (as expected 
for clustering). 

A final category of propagation effects is strong lens- 
ing (either plasma lensing or gravitational lensing) by 
foreground galaxies. Although strong lenses are rare, 
they can produce large magnification, increasing the de- 
tection rate of background FRBs by a large factor if the 
FRB luminosity function is sufficiently steep. A com- 
plete analysis of strong lensing in CHIME/FRB would 


8 Formally, the selection function for scattering is a decreasing
function of scattering width. This can be seen directly in
Figure 15 of CHIME/FRB Collaboration (2021).

be a substantial undertaking, and we defer it to a future
paper. 


5. SUMMARY AND CONCLUSIONS 


In this paper, we find a cross-correlation between
CHIME FRBs and galaxies at redshifts $0.3 \lesssim z \lesssim 0.5$.
The correlation is statistically significant in three galaxy
surveys: WISExSCOS, DESI-BGS, and DESI-LRG.
The statistical significance of the detection in each survey
is $p \sim 2.7 \times 10^{-5}$, $3.1 \times 10^{-4}$, and $4.1 \times 10^{-4}$, respectively.
These p-values account for look-elsewhere effects
in both angular scale $L$ and redshift range.

The FRB-galaxy correlation is detected on angular
scales ($\ell \sim 1000$) in the one-halo regime. In this regime,
the amplitude of the correlation is proportional to the
mean “link count” $\eta$ of the FRB population, i.e. mean
number of galaxies in the same halo as an FRB.
Cross-correlating CHIME FRBs with $0.3 \lesssim z \lesssim 0.5$ galaxies,
we find $\eta$ of order unity.

This measurement of $\eta$ cannot be directly translated
to the probability $p$ that an FRB is in the given redshift
range. We can write $\eta = p\tilde\eta$, where $\tilde\eta$ is the mean link
count of FRBs in the redshift range. Formally, we measure
($p\tilde\eta$) but not the individual factors $p$, $\tilde\eta$. However, in
the bright galaxy surveys considered here, dark matter
halos rarely contain more than a few catalog galaxies.
We conclude that $\tilde\eta$ must be of order unity, implying that
$p$ is also of order unity. That is, an order-one fraction
of CHIME FRBs are in redshift range $0.3 \lesssim z \lesssim 0.5$.

We have phrased this conclusion as a qualitative statement
(“order-one fraction”), since it is difficult to assign
a quantitative upper bound to $\tilde\eta$. This issue is a
limitation of measuring FRB-galaxy correlations in the
one-halo regime, where the FRB redshift distribution always
appears multiplied by a linking factor $\tilde\eta$. Future
CHIME/FRB catalogs should contain enough FRBs to
detect the FRB-galaxy correlation on two-halo scales
($\ell \sim 100$) (Rafiei-Ravandi et al. 2020), which will help
break this degeneracy.

We find statistical evidence for a population of FRBs
with large host DMs, on the order of $\mathrm{DM}_{\rm host} \sim 400$
pc cm$^{-3}$. More precisely, we detect a nonzero correlation
between FRBs with DM > 785 pc cm$^{-3}$ (after subtracting
the YMW16 estimate of the Milky Way DM)
and galaxies at $z \sim 0.4$, where the IGM contribution to
the DM is $\mathrm{DM}_{\rm IGM}(z) \sim 360$ pc cm$^{-3}$.

This may appear to be in tension with direct host
galaxy associations. At the time of this writing, 14
FRBs have been localized to host galaxies, all of which
have $\mathrm{DM}_{\rm host} \lesssim 200$ pc cm$^{-3}$. However, FRBs with DM
> 785 pc cm$^{-3}$ are currently uncommon, and our FRB-galaxy
correlation result must be interpreted carefully.
It implies that an order-one fraction of high-DM FRBs
are at redshift $z \sim 0.4$ in CHIME/FRB, but it does
not imply that an order-one fraction of FRBs at redshift
$z \sim 0.4$ have high DM. These statements are actually
consistent with the direct associations. Since there
is currently only one direct association with YMW16-subtracted
DM > 785 pc cm$^{-3}$, one cannot currently
rule out the possibility that an order-one fraction of
high-DM FRBs are at $z \sim 0.4$.

The number of direct host associations is rapidly growing,
and we predict that direct associations will soon
find high-DM FRBs with z ~ 0.4. However, we note 
that most direct associations to date have been discov- 
ered by ASKAP at lower DM (on average) than the 
CHIME/FRB sample. 

We briefly explore mechanisms for producing host
DMs > 400 pc cm⁻³, and show that contributions from
gas in large halos provide a plausible mechanism. Quantitatively,
we find that for FRBs near the centers (r <
100 kpc) of large (M ~ 10¹⁴ M_⊙) halos the host halo DM
can be > 400 pc cm⁻³ (Figure 11), at least in one widely
used ICM model (Prochaska & Zheng 2019). FRBs in
such halos will strongly correlate with galaxies, since
bright survey galaxies are often found in large halos. We
show that line-of-sight propagation effects are unlikely
to be a significant source of bias (§4.4).

Future measurements of  FRB-galaxy cross- 
correlations will have higher SNR, and the results pre- 
sented here could be extended in several ways. One 
could bin simultaneously in galaxy redshift and FRB 
DM, to explore the FRB-galaxy correlation strength as 
a function of two variables (z, DM). Cross-correlations 
can constrain the high-z tail of the FRB redshift distri- 
bution, where direct associations are difficult since in- 
dividual galaxies are usually faint (Eftekhari & Berger 
2017). Very high-z FRBs, if present, can be used to 
constrain cosmic reionization history (Caleb et al. 2019; 
Linder 2020; Zhang et al. 2021). Finally, line-of-sight 
propagation effects will eventually be detectable in $C_\ell^{fg}$
and will be an interesting probe of the distribution of 
electrons in the universe. 

This paper is based on FRBs from CHIME/FRB Catalog 1,
which contains 492 unique sources and approximate
angular sky positions. Future CHIME/FRB catalogs
will include more FRB sources, many of which will
have improved angular resolution through use of base- 
band data (Michilli et al. 2021). The FRB-galaxy corre- 
lation presented here should have much higher statistical 
significance in future CHIME/FRB catalogs and will be 
exciting to explore.

APPENDIX


A. STATISTICAL ERRORS ON FRB LOCATIONS 


Statistical errors in CHIME/FRB sky locations suppress
the FRB-galaxy power spectrum $C_\ell^{fg}$ on small
scales (large ℓ). The suppression takes the form
$C_\ell^{fg} \to b_\ell C_\ell^{fg}$, where $b_\ell$ is the "beam" transfer function.
Throughout the paper, we have modeled statistical errors
as Gaussian, which leads to a transfer function of
the form $b_\ell = e^{-\ell^2/L^2}$.

In this appendix, we will study statistical errors in 
more detail, using toy models of the CHIME/FRB in- 
strument and the FRB population. Our conclusions are 
as follows: 


• Statistical errors are not strictly Gaussian, but a
  Gaussian transfer function $b_\ell = e^{-\ell^2/L^2}$ is a good
  approximation within the error bars of our $C_\ell^{fg}$
  measurement.

• Calculating L from first principles is hard, since
  it depends on both the CHIME/FRB instrument
  and the FRB population. A plausible range of
  L-values is 315 < L < 1396.


This justifies the methodology used throughout the paper,
where a Gaussian transfer function $b_\ell = e^{-\ell^2/L^2}$ is
used, but L is a free parameter that we fit to the data,
varying L over the range 315 < L < 1396.
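
To build intuition for the scale parameter L, the sketch below evaluates the Gaussian transfer function and converts L into an equivalent positional scatter. The conversion is an assumption made here for illustration and is not taken from the analysis: in the flat-sky limit, isotropic Gaussian position errors with per-axis rms σ have Fourier transform exp(−ℓ²σ²/2), so that L = √2/σ. The function names are hypothetical.

```python
import math

ARCMIN_PER_RAD = 180.0 * 60.0 / math.pi


def gaussian_beam(ell, L):
    """Gaussian transfer function b_ell = exp(-ell^2 / L^2)."""
    return math.exp(-(ell / L) ** 2)


def rms_error_arcmin(L):
    """Per-axis rms position error (arcmin) implied by L.

    Assumes flat-sky, isotropic Gaussian errors, for which
    b_ell = exp(-ell^2 sigma^2 / 2) and therefore sigma = sqrt(2) / L.
    """
    sigma_rad = math.sqrt(2.0) / L
    return sigma_rad * ARCMIN_PER_RAD


if __name__ == "__main__":
    for L in (315, 670, 900, 1000, 1396):
        print(f"L = {L:5d}: sigma ~ {rms_error_arcmin(L):5.2f} arcmin, "
              f"b_ell(ell=500) = {gaussian_beam(500, L):.3f}")
```

Under this assumption, the plausible range 315 < L < 1396 corresponds to per-axis rms errors between roughly 3.5 and 15 arcmin.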


A.1. Toy beam model 1: uniform density, center of nearest beam


CHIME FRBs are detected by searching a 4 x 256 
regular array of formed beams independently in real 
time. A best-fit sky location is assigned to each de- 
tected FRB based on the detection SNR (or nonde- 
tection) in each beam, using the localization pipeline 
described by CHIME/FRB Collaboration (2019, 2021). 
For an FRB which is detected in a single beam, the local- 
ization pipeline assigns sky location equal to the beam 
center. For a multibeam detection, the assigned sky lo- 
cation is roughly a weighted average of the beams where 
the event was detected. 

As a first attempt to model statistical errors in the 
localization pipeline, suppose that when an FRB is de- 
tected we assign it to the center of the closest FRB 
beam. This is a reasonable model for the single-beam
detections as described above. 

We neglect wavelength dependence of the beam and
evaluate at central wavelength λ = 0.5 m. We also neglect
FRBs in sidelobes of the primary beam, since these
are a small fraction of the CHIME/FRB Catalog 1. Finally,
we assume that FRBs detected by CHIME/FRB
are uniformly distributed over the sky. (This turns out
to be a dubious approximation, as we will show in the
next subsection.) What is $b_\ell$ in this toy model?

Let $\Theta_e$ be the elevation of the detected FRB (with the
usual astronomical definition, i.e. $\Theta_e = 0$ for an FRB on
the horizon, or $\Theta_e = \pi/2$ for an FRB at zenith). Let
$\theta_x, \theta_y$ be east-west and north-south sky coordinates in a
coordinate system where the center of the formed beam
is at (0,0). Let S be the set of points closer to (0,0)
than any of the other beam centers:

$$ S = \left[ -\frac{\theta_0}{2}, \frac{\theta_0}{2} \right] \times \left[ -\frac{\theta_0}{2\sin\Theta_e}, \frac{\theta_0}{2\sin\Theta_e} \right], \qquad \text{(A1)} $$


where $\theta_0 = 23.4'$ in CHIME. If the detected FRBs are
uniformly distributed on the sky, then the effective beam
is

$$ b_\ell = \frac{\int_S d^2\theta \, J_0(\ell\theta)}{\int_S d^2\theta}, \qquad \text{(A2)} $$

where $J_0(x)$ is a Bessel function and $\theta = (\theta_x^2 + \theta_y^2)^{1/2}$.
For the CHIME/FRB catalog, which contains FRBs with different
elevations $\Theta_e$, we average $b_\ell$ over $\Theta_e$ values in the catalog. It is
straightforward to compute the elevation $\Theta_e$ for each
FRB, using values of RA, Dec, and time of observation
taken directly from the catalog. The resulting transfer
function $b_\ell$ is shown in Figure 14, and agrees well with
a Gaussian transfer function $b_\ell = e^{-\ell^2/L^2}$ with L = 670.
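
As an illustration of the toy model in this subsection, the following sketch evaluates the transfer function for a single elevation by midpoint-rule integration of J_0(ℓθ) over the rectangle S, and compares it with a Gaussian whose L is matched at small ℓ. It is a minimal sketch under stated assumptions, not the procedure used for Figure 14: it treats one elevation rather than averaging over the catalog, the Gaussian matching uses the expansion J_0(x) ≈ 1 − x²/4 (which gives L = 2/⟨θ²⟩^{1/2}) rather than a fit, and all function names are hypothetical.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # radians per arcminute


def bessel_j0(x, terms=60):
    """Power series for J0(x); adequate for |x| up to ~20 in double precision."""
    q = -0.25 * x * x
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total


def toy_beam_model1(ell, elevation, theta0=23.4 * ARCMIN, n=60):
    """Midpoint-rule average of J0(ell * |theta|) over the rectangle S."""
    half_x = 0.5 * theta0
    half_y = 0.5 * theta0 / math.sin(elevation)
    total = 0.0
    for i in range(n):
        tx = -half_x + (i + 0.5) * (2.0 * half_x / n)
        for j in range(n):
            ty = -half_y + (j + 0.5) * (2.0 * half_y / n)
            total += bessel_j0(ell * math.hypot(tx, ty))
    return total / (n * n)


def gaussian_L_small_ell(elevation, theta0=23.4 * ARCMIN):
    """L obtained by matching 1 - ell^2/L^2 to 1 - ell^2 <theta^2> / 4."""
    a = theta0
    b = theta0 / math.sin(elevation)
    mean_theta_sq = (a * a + b * b) / 12.0  # uniform density on a rectangle
    return 2.0 / math.sqrt(mean_theta_sq)


if __name__ == "__main__":
    zenith = math.pi / 2.0
    L = gaussian_L_small_ell(zenith)
    print(f"small-ell matched L at zenith: {L:.0f}")
    for ell in (100, 300, 600, 1000):
        print(ell, round(toy_beam_model1(ell, zenith), 4),
              round(math.exp(-(ell / L) ** 2), 4))
```

At zenith the small-ℓ matching gives L ≈ 720, in the same range as the catalog-averaged value L = 670. Lower elevations stretch S in the north-south direction and reduce L.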


A.2. Toy beam model 2: including selection bias 


In the previous subsection, we neglected a selection 
bias: an FRB is more likely to be detected if it is lo- 
cated at the center of the beam (where the instrumental 
response is largest). To account for this selection bias, 
we define the unnormalized intensity beam: 

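$$ B(\theta_x, \theta_y) = \frac{\mathrm{sinc}^2(\theta_x D_x/\lambda)}{\mathrm{sinc}^2(\theta_x D_x/(4\lambda))} \, \mathrm{sinc}^2\big((\theta_y D_y \sin\Theta_e)/\lambda\big), \qquad \text{(A3)} $$

where $\theta_x, \theta_y, \Theta_e, \lambda$ are defined in §A.1, the CHIME
aperture is modeled as a rectangle with dimensions
$(D_x, D_y) = (80, 100)$ meters, and $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$.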

Assuming a Euclidean FRB fluence distribution
$N(\geq F) \propto F^{-3/2}$ (consistent with statistical analysis of
the CHIME/FRB Catalog 1; CHIME/FRB Collaboration 2021),
the probability of detecting an FRB at sky
location $(\theta_x, \theta_y)$ is $\propto B(\theta_x, \theta_y)^{3/2}$. Therefore, the beam
transfer function is

$$ b_\ell = \frac{\int_S d^2\theta \, B(\boldsymbol{\theta})^{3/2} J_0(\ell\theta)}{\int_S d^2\theta \, B(\boldsymbol{\theta})^{3/2}}, \qquad \text{(A4)} $$

averaged over catalog elevations $\Theta_e$ as in the previous
subsection. The resulting transfer function $b_\ell$ is shown
in Figure 14 and agrees well with a Gaussian transfer
function $b_\ell = e^{-\ell^2/L^2}$ with L = 900.

Figure 14. CHIME/FRB beam transfer function $b_\ell$ in a
toy beam model, without (model 1, §A.1) and with (model 2,
§A.2) selection bias included. Since $b_\ell$ is elevation dependent,
the result is slightly different after averaging over FRBs in
the WISExSCOS (orange) and DESI-BGS/LRG (blue) sky
regions. For values of ℓ which are resolved by the beam (say,
$b_\ell \gtrsim 0.25$), the beams are well approximated by Gaussians
$b_\ell = e^{-\ell^2/L^2}$ (dotted curves).
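
The selection-weighted average of §A.2 can be sketched in the same way. The code below weights each point of the rectangle S by B^{3/2} before averaging J_0(ℓθ), for a single elevation. It is a minimal sketch, not the calculation behind Figure 14, and it assumes that the intensity beam has the form sinc²(θ_x D_x/λ) sinc²(θ_y D_y sin Θ_e/λ) / sinc²(θ_x D_x/(4λ)), with S defined as in §A.1. Function names and keyword arguments are hypothetical.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # radians per arcminute


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def bessel_j0(x, terms=60):
    """Power series for J0(x); adequate for |x| up to ~20 in double precision."""
    q = -0.25 * x * x
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total


def intensity_beam(tx, ty, elevation, lam=0.5, dx=80.0, dy=100.0):
    """Unnormalized intensity beam B(theta_x, theta_y); assumed form, see text."""
    ew = (sinc(tx * dx / lam) / sinc(tx * dx / (4.0 * lam))) ** 2
    ns = sinc(ty * dy * math.sin(elevation) / lam) ** 2
    return ew * ns


def toy_beam_model2(ell, elevation, theta0=23.4 * ARCMIN, n=60, **beam_kwargs):
    """b_ell with detection weights B^(3/2) (Euclidean fluence counts)."""
    half_x = 0.5 * theta0
    half_y = 0.5 * theta0 / math.sin(elevation)
    num = den = 0.0
    for i in range(n):
        tx = -half_x + (i + 0.5) * (2.0 * half_x / n)
        for j in range(n):
            ty = -half_y + (j + 0.5) * (2.0 * half_y / n)
            w = intensity_beam(tx, ty, elevation, **beam_kwargs) ** 1.5
            num += w * bessel_j0(ell * math.hypot(tx, ty))
            den += w
    return num / den


if __name__ == "__main__":
    zenith = math.pi / 2.0
    for ell in (100, 300, 600, 1000):
        print(ell, round(toy_beam_model2(ell, zenith), 4))
```

Because the weights favor the beam center, the weighted average is suppressed less at a given ℓ than the uniform average, which is the sense in which selection bias increases the effective L. Passing `lam=0.75` or `lam=0.375` shows the wavelength dependence.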


A.3. Plausible range of L-values 


Comparing the last two subsections, we see that the 
selection bias considered in §A.2 increases the effective 
value of L by 34%. This treatment of selection bias is incomplete,
and a full study is outside the scope of this paper.
For example, $b_\ell$ depends on wavelength λ, so there
is a selection bias involving FRB frequency spectra. In 
addition, we have not attempted to model multibeam 
detections, which will be better localized than single- 
beam detections. Given these sources of modeling un- 
certainty, rather than trying to model the value of L 
precisely, we will assign a range of plausible L-values. 

To assign a smallest plausible L-value, we make assumptions
that lead to the largest plausible localization
errors. We start with the toy beam model $b_\ell$ from §A.2,
with λ = 0.75 m (the longest wavelength in CHIME).
We then convolve with a halo profile (bg — beue(M, z)?, 
where $u_\ell(M, z)$ is a Navarro-Frenk-White (NFW) density
profile; Navarro et al. 1997), taking the halo mass
M to be large (M = 101^? 57! M.) and the redshift to 
be small (z = 0.05). These specific values are somewhat
arbitrary, but the goal is to establish a baseline plausible
value of L_min, not model a precise value of L. With the
assumptions in this paragraph, we get L_min = 315.


Similarly, to assign a largest plausible L-value, we
make assumptions that lead to the smallest plausible
localization errors. We use the toy beam model
from §A.2 with λ = 0.375 m (the shortest wavelength in
CHIME). We assume that 40% of the events are multibeam
detections, and that multibeam detections have
localization errors that are smaller by a factor of 3. As in
the previous paragraph, these specific values are somewhat
arbitrary, but the goal is to establish a baseline plausible
value of L_max, not model a precise value of L. With the
assumptions in this paragraph, we get L_max = 1396.


B. NULL TESTS 


As a general check for robustness of our FRB-galaxy
correlation $C_\ell^{fg}$, we would like to check that $C_\ell^{fg}$ does
not depend on external variables, for example time of
day (TOD). Our methodology for doing this is as follows.
We divide the FRB catalog into low-TOD and
high-TOD subcatalogs, cross-correlate each subcatalog
with a galaxy sample, and compute the difference power
spectrum:


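$$ \delta\hat{C}_\ell^{fg} \equiv \hat{C}_\ell^{fg\,(\mathrm{high})} - \hat{C}_\ell^{fg\,(\mathrm{low})}. \qquad \text{(B5)} $$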


Recall that for a non-null power spectrum $\hat{C}_\ell^{fg}$, we compressed
the ℓ-dependence into a scalar summary statistic
$\hat\alpha_L$ by taking a weighted ℓ-average (Eq. 11). Analogously,
we compress the difference spectrum $\delta\hat{C}_\ell^{fg}$ into
a summary statistic $\hat\beta_L$, defined by


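$$ \hat\beta_L = \sum_{\ell \geq \ell_{\min}} (2\ell+1) \, \frac{e^{-\ell^2/L^2}}{C_\ell^{gg}} \, \delta\hat{C}_\ell^{fg}, \qquad \text{(B6)} $$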


where L is an angular scale parameter. Next, by analogy
with $\mathrm{SNR}_L$ (defined previously in Eq. 14), we define

$$ \Delta_L \equiv \frac{\hat\beta_L}{\mathrm{Var}(\hat\beta_L)^{1/2}}. \qquad \text{(B7)} $$


The value of $\Delta_L$ quantifies consistency (in "sigmas") between
$C_\ell^{fg}$ for the low-TOD and high-TOD subcatalogs.
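
The construction of the consistency statistic can be sketched as follows: compress the difference spectrum into a weighted sum, then divide by the scatter of the same sum over an ensemble of mocks. This is a minimal sketch rather than the pipeline used in the paper. The weights in the demonstration are illustrative only (the weights of Eq. 11 are not reproduced here), the mocks are stand-in noise realizations, and the function names are hypothetical.

```python
import math
import random
import statistics


def beta_hat(delta_cl, weights):
    """Weighted sum over multipoles of the difference spectrum."""
    return sum(w * d for w, d in zip(weights, delta_cl))


def delta_sigma(data_delta_cl, mock_delta_cls, weights):
    """Delta = beta_hat / sqrt(Var(beta_hat)), with the variance from mocks."""
    beta_data = beta_hat(data_delta_cl, weights)
    beta_mocks = [beta_hat(m, weights) for m in mock_delta_cls]
    return beta_data / math.sqrt(statistics.variance(beta_mocks))


if __name__ == "__main__":
    rng = random.Random(1)
    ells = list(range(50, 2000, 50))
    L = 1000.0
    # Illustrative weights only; the weights of Eq. 11 / Eq. B6 may differ.
    weights = [(2 * ell + 1) * math.exp(-(ell / L) ** 2) for ell in ells]
    # Stand-in mocks: pure-noise difference spectra.
    mocks = [[rng.gauss(0.0, 1e-6) for _ in ells] for _ in range(500)]
    data = [rng.gauss(0.0, 1e-6) for _ in ells]
    print(f"Delta_L = {delta_sigma(data, mocks, weights):+.2f}")
```

Any overall normalization of the weights cancels in the ratio, so only their ℓ-dependence matters for the resulting number of "sigmas".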

We fix L = 1000, and consider three choices of
galaxy catalog: WISExSCOS with z > 0.3125, DESI-BGS
with z > 0.295, and DESI-LRG with z <
0.485. These redshift ranges are "cherry-picked" to
maximize the FRB-galaxy cross-correlation (see Figure 7),
but this cherry-picking should not bias the
difference statistic $\Delta_L$. With these choices, we find
$\Delta_L = \{1.22, -0.21, 1.30\}$ for WISExSCOS, DESI-BGS,
and DESI-LRG respectively. Therefore, there is no statistical
evidence for dependence of $C_\ell^{fg}$ on time of day,
since a 1.22σ, 0.21σ, or 1.30σ result is not statistically
significant.

This test can be generalized by splitting on a variety of
external variables (besides TOD). In Table 3, we identify
12 such variables and denote the corresponding $\Delta_L$ values
(with L = 1000) by $\Delta_i$, where $i \in \{1, 2, 3, \ldots, 12\}$.
We note that these 12 tests are not independent, for example
SNR is correlated with fluence. We also note that
for many of these tests detection of a nonzero difference
spectrum $\delta\hat{C}_\ell^{fg}$ does not necessarily indicate a problem.
For example, DM dependence of $C_\ell^{fg}$ is expected at some
level, since $C_\ell^{fg}$ is redshift dependent, and DM is correlated
with redshift.

There are a few ~2σ outliers in Table 3, but a few
outliers are unsurprising, so it is not immediately clear
whether the $\Delta_i$ values in Table 3 are statistically different
from zero. To answer this question, we reduce the
12-component vector $\Delta_i$ into a scalar summary statistic,
in a few different ways as follows.

Our first summary statistic is intended to test whether
the most anomalous $\Delta_i$-value in each column of Table 3
is statistically significant. We define

$$ \Delta_{\max} = \max_i |\Delta_i|. \qquad \text{(B8)} $$


We then compare these values of $\Delta_{\max}$ to an ensemble
of mocks. The mocks are constructed by randomizing
the RA of each FRB in the catalog, keeping all other
FRB properties (DM, SNR, etc.) fixed. This preserves
any correlations which may be present between FRB
properties. In the special case of the |b| > 17° null test,
we recompute the value of b after randomizing RA.

In Table 4, we report the p-value for each $\Delta_{\max}$,
i.e. the fraction of mocks whose $\Delta_{\max}$ exceeds the
"data" value. No statistically significant deviation from
$\Delta_{\max} = 0$ is seen.
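
The mock construction and the empirical p-value can be sketched as below. This is an illustration under assumptions, not the code used in the paper: the catalog is represented as a list of dictionaries with a hypothetical `ra` field in degrees, "randomizing" is read here as an independent uniform draw in RA, and the function names are invented for the example.

```python
import random


def randomize_ra(catalog, rng):
    """Return a mock catalog with RA drawn uniformly in [0, 360) degrees.

    Every other field (Dec, DM, SNR, ...) stays attached to its FRB, so
    correlations between FRB properties are preserved.
    """
    return [dict(frb, ra=rng.uniform(0.0, 360.0)) for frb in catalog]


def delta_max(deltas):
    """Largest absolute value among the null-test statistics (Eq. B8)."""
    return max(abs(d) for d in deltas)


def mock_p_value(data_value, mock_values):
    """Fraction of mocks whose statistic exceeds the data value."""
    n_exceed = sum(1 for v in mock_values if v > data_value)
    return n_exceed / len(mock_values)


if __name__ == "__main__":
    # WISExSCOS column of Table 3.
    wise = [0.33, 0.58, 1.49, 0.59, 0.68, 1.28,
            -0.44, 0.59, 0.52, 0.99, -0.63, 1.22]
    rng = random.Random(0)
    # Stand-in mocks: 12 independent unit Gaussians per realization. Real
    # mocks would rerun the full null-test pipeline on randomized catalogs.
    mock_stats = [delta_max([rng.gauss(0.0, 1.0) for _ in range(12)])
                  for _ in range(10000)]
    print(delta_max(wise), mock_p_value(delta_max(wise), mock_stats))
```

In a full analysis each mock catalog would be passed through the same 12 splits as the data, which automatically accounts for the correlations between tests that the stand-in Gaussians ignore.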

Our second summary statistic is intended to test
whether the 12-component vector $\Delta_i$ is consistent with
a multivariate Gaussian distribution. We define:

$$ \chi^2 = \sum_{i,j} \Delta_i \, \mathrm{Cov}(\Delta_i, \Delta_j)^{-1} \, \Delta_j, \qquad \text{(B9)} $$

where the covariance $\mathrm{Cov}(\Delta_i, \Delta_j)$ is estimated from
mock FRB catalogs, constructed as described above.
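
A minimal sketch of this statistic is given below: the covariance is estimated from an ensemble of mock Δ-vectors and the quadratic form is evaluated by solving a linear system rather than forming the inverse explicitly. The helper names are hypothetical, and the small Gaussian-elimination solver is included only to keep the example free of external dependencies.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def sample_covariance(vectors):
    """Unbiased covariance matrix of a list of equal-length vectors."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def chi_squared(delta, cov):
    """chi^2 = Delta^T Cov^{-1} Delta, without forming the inverse."""
    return sum(d * y for d, y in zip(delta, solve_linear(cov, delta)))
```

For example, `chi_squared(delta_data, sample_covariance(mock_deltas))` gives the "data" value, and applying the same function to each mock vector gives the ensemble against which it is ranked.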

As before, to assign statistical significance, we compare
the "data" value of χ² to an ensemble of mocks and
report the associated p-value in Table 4. We find borderline
evidence for an anomalously large χ² for DESI-BGS (p = 0.030), but
interpret this as inconclusive, since Table 4 contains six
p-values, so one p-value as small as 0.03 is unsurprising
(this happens with probability ~0.18).
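
The quoted probability can be checked with a one-line estimate. As a sketch, treat the six p-values in Table 4 as independent and uniformly distributed under the null hypothesis (an idealization, since the statistics are computed from the same data):

```latex
P\left(\min_k p_k \le 0.03\right)
  = 1 - (1 - 0.03)^6 \approx 0.17
  \;\le\; 6 \times 0.03 = 0.18 ,
```

so the quoted ~0.18 corresponds to the union (Bonferroni) bound, and the independent-trials value is slightly smaller.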

Finally, we compare the set of 12 $\Delta_i$ values to a jackknife
distribution, obtained by randomly splitting the
FRB catalog in half. We do this comparison using the
2-sample Kolmogorov-Smirnov (KS, Hodges 1958) and
Anderson-Darling (AD, Scholz & Stephens 1987) tests.
Figure 15 compares the two distributions for the three
galaxy samples, and the last two columns of Table 4
summarize our results. As in the previous paragraph,
there is one outlier: the WISExSCOS KS p-value is
0.037, which we interpret as inconclusive, since it is one
out of six p-values in the table.

Figure 15. Histograms of the statistic $\Delta_i$ for the 12 null
tests (filled markers) and 100 jackknives (lines), for the
WISExSCOS, DESI-BGS, and DESI-LRG galaxy samples. Using the
KS and AD tests, the distributions are found to be consistent
(Appendix B).
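
For readers who want to reproduce this kind of comparison, the sketch below computes the two-sample KS statistic and an approximate p-value. It is a minimal illustration, not the implementation used here: the p-value uses a standard asymptotic approximation with a small-sample correction to the effective sample size, which is only approximate for a sample of 12, and the AD test is not implemented. Library routines such as `scipy.stats.ks_2samp` and `scipy.stats.anderson_ksamp` are the usual choice in practice.

```python
import math


def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk through the pooled sample, tracking both empirical CDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    n_eff = math.sqrt(n * m / (n + m))
    lam = (n_eff + 0.12 + 0.11 / n_eff) * d
    if lam < 0.3:
        # The alternating series converges poorly here; the p-value is ~1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)
```

A call such as `ks_two_sample(null_test_deltas, jackknife_deltas)` returns the maximum distance between the two empirical distributions and the approximate probability of a distance at least that large under the null hypothesis.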

Summarizing this appendix, we do not find statisti- 
cally significant evidence that the FRB-galaxy cluster- 
ing signal studied in this paper depends on any of the 
parameters in Table 3. 


C. TAIL-FITTING PROCEDURE 


In §3.5, we assign statistical significance of the FRB-galaxy
detection, by defining a frequentist statistic
$\mathrm{SNR}_{\max}$, and ranking the "data" value $\mathrm{SNR}_{\max}^{(\mathrm{data})}$ within
a histogram of simulated values $\mathrm{SNR}_{\max}^{(\mathrm{mock})}$. This procedure
is conceptually straightforward, but there is a
technical challenge: because $\mathrm{SNR}_{\max}^{(\mathrm{data})}$ turns out to be
an extreme outlier, a brute-force approach requires an
impractical number of simulations. Therefore, we fit
the tail of the $\mathrm{SNR}_{\max}^{(\mathrm{mock})}$ distribution to an analytic distribution
and assign statistical significance (or p-value)
analytically.

Empirically, we find that the top 10% of the $\mathrm{SNR}_{\max}^{(\mathrm{mock})}$
distribution agrees well with the top 10% of a Gaussian
distribution, as shown in Figure 16. The parameters of
the Gaussian distribution were determined as follows.


Parameter                    Median        Δ_i          Median     Δ_i        Δ_i
                             (WISExSCOS)   WISExSCOS    (DESI)     DESI-BGS   DESI-LRG

DM [pc cm⁻³]                 535.08         0.33        536.41     -1.68      -0.95
SNR                          20.2           0.58        20.2        0.00       0.42
Scattering time [ms]         1.331          1.49        1.423       0.93       0.38
Pulse width [ms]             0.988          0.59        1.052       0.99       0.24
Spectral index               2.866          0.68        2.075       0.76      -0.25
Fluence [Jy ms]              3.503          1.28        3.115      -1.00       2.16
Bandwidth [MHz]              332.09        -0.44        358.09      0.80       1.58
Galactic |b|                 38.26°         0.59        38.24°     -1.27      -1.95
Catalog localization error   10.12′         0.52        9.53′       2.16       1.19
TOA − 58528 [MJD]            0.3686595      0.99        4.8473498   1.16       0.63
Peak frequency [MHz]         463.525       -0.63        449.036     1.97       1.30
Time of day [hr]             9.887          1.22        10.132     -0.21       1.30

Table 3. Null tests in Appendix B. For each parameter, we split the FRB catalog into "low" and "high" subcatalogs by
comparing the parameter value to its median. (The median value is slightly different for FRBs in the WISExSCOS and DESI
footprints.) We correlate both subcatalogs with the galaxy surveys, and compute the statistic $\Delta = (\Delta_L)_{L=1000}$ (defined in
Eq. B7), which measures consistency of the FRB-galaxy correlation in "sigmas".

Galaxy sample   Δ_max   p-value   χ²      p-value   KS p-value   AD p-value

WISExSCOS       1.49    0.779     9.26    0.659     0.037        0.067
DESI-BGS        2.16    0.270     22.96   0.030     0.381        0.250
DESI-LRG        2.16    0.274     17.26   0.145     0.113        0.171

Table 4. Summary statistics for the 12 null tests in Table 3. As described in Appendix B, we reduce the 12-component vector
$\Delta_i$ into two scalar summary statistics $\Delta_{\max}$, χ² shown in the first four columns along with associated p-values from an ensemble
of mocks. The last two columns compare the $\Delta_i$ values for each galaxy sample to a "jackknife" ensemble defined by randomly
splitting the CHIME/FRB catalog.

Let $p(x|\mu, \sigma)$ denote a Gaussian distribution with mean
μ and variance σ²:

$$ p(x|\mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right). \qquad \text{(C10)} $$
Let $X_+ \subset \mathbb{R}$ be the top 10% of the simulated $\mathrm{SNR}_{\max}^{(\mathrm{mock})}$
values, and let $X_-$ be the bottom 90%. Let $M_0 \in \mathbb{R}$ be
the 90th percentile of the $\mathrm{SNR}_{\max}^{(\mathrm{mock})}$ distribution. Then,
we choose parameters (μ, σ) to maximize the likelihood
function:

$$ \log \mathcal{L}(x|\mu, \sigma) = \left( \sum_{x \in X_+} \log p(x|\mu, \sigma) \right) + |X_-| \log \int_{-\infty}^{M_0} dx \, p(x|\mu, \sigma), \qquad \text{(C11)} $$

where x denotes mock realizations. This likelihood function
has been constructed to fit parameters to the details
of the $X_+$ values, while putting all $X_-$ values into a single
coarse bin.
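
The tail-fitting likelihood can be sketched as follows. This is a minimal illustration under assumptions, not the fitting code used in the paper: the maximization is a crude grid search around the sample mean and standard deviation (any optimizer could be substituted), the threshold is taken to be the smallest sample in the top 10%, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def fit_gaussian_tail(samples, tail_fraction=0.10, grid=100):
    """Censored maximum-likelihood Gaussian fit in the spirit of Eq. C11.

    The top `tail_fraction` of the samples enter individually; all other
    samples are counted in a single bin below the threshold x0.
    """
    xs = sorted(samples)
    n_low = int(round((1.0 - tail_fraction) * len(xs)))
    x0 = xs[n_low]                      # threshold, about the 90th percentile
    tail = xs[n_low:]
    n, s1, s2 = len(tail), sum(tail), sum(x * x for x in tail)

    def log_like(mu, sigma):
        # Sum of log p(x | mu, sigma) over the tail, via sufficient statistics.
        ll_tail = (-n * math.log(sigma * math.sqrt(2.0 * math.pi))
                   - (s2 - 2.0 * mu * s1 + n * mu * mu) / (2.0 * sigma ** 2))
        # All remaining samples fall in one coarse bin, x < x0.
        ll_bulk = n_low * math.log(NormalDist(mu, sigma).cdf(x0))
        return ll_tail + ll_bulk

    # Crude grid search: mu within one sample standard deviation of the
    # sample mean, sigma between 0.5 and 1.5 sample standard deviations.
    m, s = mean(xs), stdev(xs)
    best = max((log_like(m + s * (2.0 * i / grid - 1.0), s * (0.5 + j / grid)),
                m + s * (2.0 * i / grid - 1.0), s * (0.5 + j / grid))
               for i in range(grid + 1) for j in range(grid + 1))
    return best[1], best[2], x0


def analytic_p_value(value, mu, sigma):
    """Upper-tail probability of the fitted Gaussian at `value`."""
    return 0.5 * math.erfc((value - mu) / (sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    rng = random.Random(0)
    mocks = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    mu, sigma, x0 = fit_gaussian_tail(mocks)
    print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, threshold = {x0:.3f}")
    print(f"p-value at 4.0: {analytic_p_value(4.0, mu, sigma):.2e}")
```

Using the complementary error function for the tail probability avoids the loss of precision that would come from computing one minus a CDF value very close to 1.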

Figure 16 is a good visual test for goodness of fit, but
as a more quantitative test, we compare the upper 10%
of the simulated histogram with the Gaussian fit using
a KS test. We find that the two distributions agree to
1σ (and likewise for the other two cases, DESI-BGS and
DESI-LRG).

Figure 16. Gaussian fit to the tail of the $\mathrm{SNR}_{\max}^{(\mathrm{mock})}$ distribution
from Appendix C. For the top ~10% of the samples
(i.e. to the right of the dotted line) the agreement between
the fit and the simulations is excellent. This plot is for
WISExSCOS; the other two cases (DESI-BGS, DESI-LRG)
are similar.

In Table 5, we compute statistical significance for
each of the three surveys, in two different ways. The
"brute-force" p-value is obtained by counting the number
of simulated $\mathrm{SNR}_{\max}^{(\mathrm{mock})}$ values (out of 10⁴ total
simulations) that exceed $\mathrm{SNR}_{\max}^{(\mathrm{data})}$. The "analytic" p-value
is obtained by fitting the top 10% of the simulated
$\mathrm{SNR}_{\max}^{(\mathrm{mock})}$ values to a Gaussian distribution, as
described above, and evaluating the upper tail probability
(one minus the CDF) of the distribution at $\mathrm{SNR}_{\max}^{(\mathrm{data})}$.
The brute-force values are either
uninformative (for WISExSCOS), or have large Poisson
uncertainties (for the other two surveys), so we have
quoted the analytic p-values as our "bottom-line" detection
significances throughout the paper.
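
The size of these Poisson uncertainties can be estimated directly. As a rough sketch, assume that the number k of mocks exceeding the data value is Poisson distributed, with N = 10⁴ simulations:

```latex
\hat{p} = \frac{k}{N}, \qquad
\sigma_{\hat{p}} \approx \frac{\sqrt{k}}{N}
\quad\Longrightarrow\quad
\hat{p} = (4 \pm 2) \times 10^{-4} \ \ (k = 4), \qquad
\hat{p} = (5 \pm 2.2) \times 10^{-4} \ \ (k = 5).
```

For k = 0 the count gives only an upper limit, roughly p ≲ 3/N = 3 × 10⁻⁴ at 95% confidence, which is why a brute-force count of zero is uninformative.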


Survey        Brute-force   Analytic
WISExSCOS     0/10000       2.7 × 10⁻⁵
DESI-BGS      4/10000       3.1 × 10⁻⁴
DESI-LRG      5/10000       4.1 × 10⁻⁴


Table 5. "Brute-force" and analytic p-values, computed as
described in Appendix C.


